Real Analysis 



P. Ouwehand 



Department of Mathematical Sciences 
Stellenbosch University 



Contents 



1 An Axiomatic Development of the Real Number System 

1.1 Why we need axioms

1.2 Arithmetic of Fields

1.3 Ordered Fields

1.4 The Continuum

1.5 The Completeness Axiom 



2 Sequences and Series 

2.1 Introduction

2.2 Definition of Convergence 

2.2.1 "Infinitely Often" and "Eventually"

2.2.2 Convergence to

2.2.3 Formal Definition of Convergence of Sequences

2.3 Arithmetic, Order and Convergence 

2.3.1 Arithmetic and Convergence 

2.3.2 Order, Completeness and Convergence 

2.4 Representation of Real Numbers by Decimals 

2.5 Introduction to Series

2.5.1 The Paradoxes of Zeno

2.5.2 Convergence of Series: Definition and Examples

2.6 Convergence of Subsequences 

2.6.1 Subsequences 

2.6.2 Bolzano-Weierstrass Theorem

2.7 Cauchy Sequences and Completeness 

2.8 Further Results on Convergence of Series 

2.8.1 Cauchy Criteria 

2.8.2 Absolute Convergence and Rearrangement of Series 

2.8.3 More Tests for Convergence 

2.9 limsup and liminf* 



3 Basic Topology 

3.1 Introduction

3.2 Open and Closed Sets — Motivation 

3.3 Open and Closed Sets — Definitions and Basic Properties 

3.3.1 Definitions

3.3.2 Open Sets 



3.3.3 Closed Sets

3.4 Compact Sets



4 Limits of Functions and Continuity

4.1 Limits of Functions

4.2 Continuity

4.3 Operations on Continuous Functions; Examples

4.4 Continuous Functions on Compact Sets



5 Differentiable Functions

5.1 Differentiation in $\mathbb{R}$

5.2 Mean Value Theorems



A Logic, Sets and Functions

A.1 Logic and Formal Language

A.1.1 Symbols denoting Objects, Operations and Relations

A.1.2 Logical Connectives

A.1.3 Quantifiers

A.2 Sets, Functions and Relations

A.2.1 Operations on sets

A.2.2 Functions

A.2.3 Functions Operating On Sets

A.3 Countable and Uncountable Sets


Chapter 1 



An Axiomatic Development of the 
Real Number System 

Traditionally, a rigorous course in Analysis progresses (more or less) in the 
following order: 

sets, mappings  =>  limits, continuous functions  =>  derivatives  =>  integration

On the other hand, the historical development of these subjects occurred in 
reverse order: 

Cantor 1875, Dedekind  <=  Cauchy 1821, Weierstrass  <=  Newton 1665, Leibniz 1675  <=  Archimedes, Kepler 1615, Fermat 1638

From Analysis by Its History, by E. Hairer and G. Wanner, Springer 1996

1.1 Why we need axioms 

Consider the following questions: 

Question 1: Many years ago, you were taught the following algorithm for multiplying two 
numbers: 

      23
    x 17
    ----
     161
     230
    ----
     391

Why does this algorithm work? 

Question 2: Why is $(-1) \times (-1) = 1$? Alternatively, why is the product of two negative numbers
a positive number?






If you think that these are silly questions, think again. The answers to these questions are 
not obvious. You are merely so used to the answers that the questions never occur to you. 

An explanation for why the multiplication algorithm works might go along the following 
lines: 

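\[
\begin{aligned}
23 \times 17 &= 23 \cdot (7 + 10) \\
&= 23 \cdot 7 + 23 \cdot 10 \\
&= (20 + 3) \cdot 7 + (20 + 3) \cdot 10 \\
&= [20 \cdot 7 + 3 \cdot 7] + [20 \cdot 10 + 3 \cdot 10] \\
&= [140 + 21] + [200 + 30] = 161 + 230 \\
&= 391
\end{aligned}
\]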

To do this calculation, we performed the following operations:

(i) We used the fact that $a \cdot (b + c) = a \cdot b + a \cdot c$ several times.

(ii) We retrieved certain results, like $3 \cdot 7 = 21$, from memory. Such results were learnt by
rote, in the form of multiplication tables. Thus all the values of $a \times b$ for $1 \le a, b \le 10$
are stored in a mental look-up table.

The values in the look-up table were determined empirically, i.e. by observation. To
see that $7 \times 8 = 56$, take 8 small bags, each containing 7 stones, and empty them into a
big bag. If you now count the number of stones in the big bag, you will get 56. That's
just a fact that's been observed over and over again, in many different places and at
many different times.

(iii) We used the fact that multiplying a number by 10 is accomplished by adding a zero to
the end of that number. Thus $20 \cdot 10 = 200$.

(iv) To calculate the value of a term such as $20 \cdot 7$ (which is not in the mental look-up table),
we have to argue that $20 \cdot 7 = 7 \cdot 20 = 7 \cdot (2 \cdot 10) = (7 \cdot 2) \cdot 10 = 14 \cdot 10 = 140$. Thus, in
addition to the look-up table and the multiply-by-ten rule, we also used the following
facts about multiplication: $a \cdot b = b \cdot a$, and $a \cdot (b \cdot c) = (a \cdot b) \cdot c$.

(v) We used another algorithm (also learnt long ago) for adding numbers, such as $161 + 230$.
Try to justify that algorithm yourself.
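
The ingredients (i) to (v) can be made explicit in a short program. The following Python sketch is only an illustration and not part of the original argument: the names `TABLE`, `digits`, `partial_rows` and `long_multiply` are hypothetical, and the look-up table is generated with Python's built-in multiplication purely to avoid typing out its entries by hand.

```python
# Illustrative sketch: the school multiplication algorithm, built from a
# look-up table, shifts by powers of ten, distributivity and addition.
# TABLE stands in for the memorised multiplication table.
TABLE = {(a, b): a * b for a in range(10) for b in range(10)}


def digits(n):
    """Return the decimal digits of n >= 0, least significant digit first."""
    return [int(ch) for ch in reversed(str(n))]


def partial_rows(x, y):
    """Return one row per digit of y, as written in the school algorithm."""
    rows = []
    for i, dy in enumerate(digits(y)):        # y is the sum of dy * 10**i
        row = 0
        for j, dx in enumerate(digits(x)):    # x is the sum of dx * 10**j
            # Distributivity: every pair of digits contributes one table
            # entry, shifted by appending i + j zeros.
            row += TABLE[(dx, dy)] * 10 ** (i + j)
        rows.append(row)
    return rows


def long_multiply(x, y):
    """Multiply two non-negative integers by adding up the partial rows."""
    return sum(partial_rows(x, y))


print(partial_rows(23, 17), long_multiply(23, 17))
```

For $23 \times 17$ the two rows are exactly the lines 161 and 230 of the written algorithm, and their sum is 391.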

As you can see, in order to explain why the multiplication algorithm works, you need to
invoke quite a few simpler results about addition and multiplication. Question 1 is not as
obvious as it looks! As for Question 2, you should be able to explain why $(-1) \times (-1) = 1$ by
the end of this chapter.

Now note the following (empirically verifiable) facts: Human beings have a certain intuition
(or idea) about non-physical objects called numbers. These numbers can be combined
in various ways to form new numbers, e.g. they can be added and multiplied. Moreover,
there are some simple rules which govern the combination of numbers, e.g.

(i) The product of two numbers does not depend on where or when the multiplication is 
performed. 

(ii) $a + b = b + a$, $ab = ba$

(iii) $a + (b + c) = (a + b) + c$, $a(bc) = (ab)c$

(iv) $a(b + c) = ab + ac$

et cetera. Our aim is now to find a set of rules, or axioms, which completely captures our 
intuition about the arithmetic of the reals. In other words, we seek a set of rules such that 

(1) The rules are in accord with our intuition about arithmetic. 

(2) The set of rules is sufficiently rich that any informal, intuitive arithmetic argument can 
be made formal: we can reach the same conclusion by applying no intuition at all, but 
just the formal rules (axioms). 

Why do we need axioms? For several reasons. 

• Axioms tend to be simple, and most people will accept them as in agreement with their 
intuition. Thus the axioms are a common starting point for all people. People who 
disagree about the axioms are probably talking about different things. 

• The agreed-upon rules can be applied over and over again, to arbitrary levels of complexity.
Any two people who agree on the (simple) axioms must also agree on the
(complicated) conclusions that may be reached by formal application of those axioms.

On the other hand, intuition becomes less and less reliable as we increase the level of
complexity, and thus conclusions obtained solely by intuition are more suspect.

For example, you and I may agree that Euclid's 5 axioms for geometry are in accordance with our
intuition of space. These axioms are simple, and difficult to disbelieve. You may have a powerful
intuition, however: You intuit that the square of the (length of) the hypotenuse of a right-angled
triangle is equal to the sum of the squares of the other two sides. But my intuition is far less developed
than yours: I just don't see it, and so I don't believe you. Should you provide a step-by-step argument,
starting from our common ground (the 5 axioms), using only commonly agreed rules, I will be forced
to admit that your intuition is correct. In this way, I can verify the truth of your assertion myself, and
don't just have to take your word for it.

• If we use the axiomatic method, we are constantly aware of our assumptions. It therefore
becomes much simpler to discern similarities and differences between various mathematical
objects and operations. This will make the arguments portable (in the Computer
Science sense: arguments (computer code) can easily be moved from one problem
(platform) to another).

• Finally, axioms allow us to circumvent metaphysical speculation about the nature and
existence of mathematical objects. What, for example, is a real number? Is it irreducible,
or is it made up of simpler things? This question was first given a satisfactory
answer in 1872. Indeed, it was given two different but satisfactory answers in that year,
by Dedekind and Cantor. In each case, the real numbers are "constructed" from some
previously constructed, simpler, objects, e.g. the rational numbers.

Thus there is no single answer to the question: "What is a real number?". But the exact
nature of the reals is unimportant for mathematical purposes. What is important is how
they behave, i.e. how they can be recombined, using various operations, to form new
numbers. The axioms are essentially just a description of such behaviour, and though
the different constructions disagree about the essential nature of the reals, they do agree
on how they behave.






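Example 1.1.1 Series are not the same as sums. In high school you learnt that

\[ 1 + \tfrac12 + \tfrac14 + \tfrac18 + \dots = 2. \]

A typical argument to prove this fact, which you probably used a number of times in your
first year calculus course, might go as follows:

Let $S := 1 + \frac12 + \frac14 + \frac18 + \dots$. Then

\[
\begin{aligned}
S = 2S - S &= 2\left(1 + \tfrac12 + \tfrac14 + \tfrac18 + \dots\right) - \left(1 + \tfrac12 + \tfrac14 + \tfrac18 + \dots\right) \\
&= \left(2 + 1 + \tfrac12 + \tfrac14 + \tfrac18 + \dots\right) - \left(1 + \tfrac12 + \tfrac14 + \tfrac18 + \dots\right) \\
&= 2 + \left[(1 - 1) + \left(\tfrac12 - \tfrac12\right) + \left(\tfrac14 - \tfrac14\right) + \left(\tfrac18 - \tfrac18\right) + \dots\right] \\
&= 2
\end{aligned}
\]

Consider now the series

\[ 1 - \tfrac12 + \tfrac13 - \tfrac14 + \tfrac15 - \tfrac16 + \dots \]

It can be shown (and we will do this later) that the above series converges to $\ln 2$.
If we rearrange the terms according to the recipe "One odd, then two evens" we obtain

\[
\begin{aligned}
\ln 2 &= 1 - \tfrac12 - \tfrac14 + \tfrac13 - \tfrac16 - \tfrac18 + \tfrac15 - \tfrac1{10} - \tfrac1{12} + \dots \\
&= \left(1 - \tfrac12\right) - \tfrac14 + \left(\tfrac13 - \tfrac16\right) - \tfrac18 + \left(\tfrac15 - \tfrac1{10}\right) - \tfrac1{12} + \dots \\
&= \tfrac12 - \tfrac14 + \tfrac16 - \tfrac18 + \tfrac1{10} - \tfrac1{12} + \dots \\
&= \tfrac12\left(1 - \tfrac12 + \tfrac13 - \tfrac14 + \tfrac15 - \tfrac16 + \dots\right) = \tfrac12 \ln 2,
\end{aligned}
\]

i.e. $\ln 2 = \frac12 \ln 2$, a contradiction. This example shows that a basic intuition fails: one
cannot simply interchange and rearrange the terms in an infinite series and expect to get
the same answer, though this is perfectly acceptable for finite sums.

At this point, you should be feeling rather uncomfortable about the series
$S := 1 + \frac12 + \frac14 + \frac18 + \dots$: the argument that $S = 2$ also seems to use some rearrangements. We need
to examine the notion of convergence of an infinite series rather more closely.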
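
The effect of the rearrangement can be observed numerically. The following Python sketch is an illustration only, not a proof; the function names `alternating_harmonic` and `rearranged` are hypothetical. It compares partial sums of the series in its original order with partial sums of the "one odd, then two evens" ordering.

```python
import math


def alternating_harmonic(n_terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... with n_terms terms."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))


def rearranged(n_blocks):
    """Partial sum of the 'one odd, then two evens' ordering.

    Block k consists of the three terms 1/(2k-1), -1/(4k-2), -1/(4k),
    so every term of the original series is used exactly once.
    """
    return sum(
        1 / (2 * k - 1) - 1 / (4 * k - 2) - 1 / (4 * k)
        for k in range(1, n_blocks + 1)
    )


original = alternating_harmonic(200_000)
shuffled = rearranged(100_000)
print(original, math.log(2))       # partial sum in the original order
print(shuffled, math.log(2) / 2)   # partial sum after rearranging
```

The same terms, added in a different order, settle near a different value. Whether and why this happens is exactly what a precise definition of convergence has to explain.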






Example 1.1.2 A Useful Monster: Brownian Motion 

We consider here an example of a counter-intuitive "monster" which is used extensively in engineering and
control theory, as well as economics and finance, to describe random phenomena.

Although a precise definition requires some knowledge of elementary statistics, a Brownian motion (also
called a Wiener process) is a random function $B : [0, \infty) \to \mathbb{R}$ defined, roughly, as follows:

(i) $B(t)$ is a continuous function with $B(0) = 0$. The variable $t$ is thought of as time.

(ii) Each change $B(t + s) - B(s)$ is a normally distributed random variable, with mean zero and variance
equal to the length of the elapsed time $t + s - s = t$.

(iii) Changes on non-overlapping time intervals are independent of each other.

A graph of a path of a Brownian motion is shown below:

[Figure: a sample path of a Brownian motion; the vertical axis runs from about -0.2 to 0.15.]

Using some rather difficult analysis, one can then prove that Brownian motion has the following properties:

• Though everywhere continuous, each path is nowhere differentiable, i.e.
$\frac{dB}{dt} := \lim_{h \to 0} \frac{B(t + h) - B(t)}{h}$ does
not exist for any $t > 0$. This means that it has no "smooth" bits anywhere.

• The length of each path over any finite non-zero interval is infinite. In particular, the path shown in the
graph has infinite length over the interval $[0, 1]$ (and also over the interval $[0, 10^{-100000000}]$).

Until the mid-nineteenth century, it was widely believed that any continuous function must be differentiable
at least somewhere, but Brownian motion provides a counterexample. Though it was used by Louis Bachelier
in 1900 to determine the prices of stock options, and by Albert Einstein in 1905 as evidence for the existence
of atoms, the mathematical object "Brownian motion" was shown to exist only in 1923, by Norbert Wiener.
For such monsters, intuition is useless, and only careful analysis can provide insight.
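
A rough numerical experiment makes the claim about infinite length plausible. The Python sketch below is an assumption-laden illustration and not a construction of Brownian motion: the function name `polygonal_length` is hypothetical, each grid is simulated independently (the finer grids do not refine one fixed path), and only properties (ii) and (iii) are used to generate the increments. It measures the length of the polygon through $n + 1$ sampled points on $[0, 1]$.

```python
import math
import random


def polygonal_length(n_steps, seed=0):
    """Length of a polygon through n_steps + 1 simulated points on [0, 1].

    The increments over each step of size dt are drawn independently from a
    normal distribution with mean 0 and variance dt.
    """
    rng = random.Random(seed)       # fixed seed so that runs are repeatable
    dt = 1.0 / n_steps
    sigma = math.sqrt(dt)           # standard deviation of one increment
    return sum(math.hypot(dt, rng.gauss(0.0, sigma)) for _ in range(n_steps))


for n in (100, 1_000, 10_000):
    print(n, polygonal_length(n))
```

The measured length keeps growing as the grid is refined, instead of settling down as it would for a smooth curve.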






1.2 Arithmetic of Fields 

The aim of this chapter is to present an axiomatization of the real number system. We shall 
do this in three stages: 

(1) First we shall discuss the purely arithmetic properties of the reals. The reals form an 
algebraic system called a field. Intuitively, a field is a set in which addition, subtraction, 
multiplication and division are possible, and obey the usual rules. 

(2) Next, we shall discuss the properties of a field equipped with an ordering relation $\le$.

(3) Finally, we shall add an axiom, the Completeness Axiom, which ensures that it is possible 
to take limits. 

The aim is to write down a set of axioms which completely characterize the system of real 
numbers. 

Remarks 1.2.1 Throughout this section, I refer to the reals as though you already know what they
are, and how they behave (which, of course, to a large extent, you do). The more philosophically minded may
therefore come to believe that much of the following discussion is circular: I define the properties of the reals
by observing the properties of the reals.

That is not the case. We will operate on two levels, the intuitive and the formal: We have an intuitive idea
of real numbers, to which I make frequent appeal. For example, we may think of real numbers as points on
a line, or as objects that have a representation of the form 3.14159265... (i.e. a decimal representation), to
name just two intuitive ideas of number. We then extract the basic properties (axioms) of the real number
system from our intuitive notions, leading from the informal level to the formal. All theorems, propositions, etc.
will be proved from these formal properties. Thus while we do use intuitive notions to write down the basic
axioms, we do not use intuitive notions to prove theorems. For that, we use only the basic axioms.

□ 



Definition 1.2.2 A field is a tuple $(F, +, \cdot, -, {}^{-1}, 0, 1)$, satisfying all the properties below:

• $F$ is a set;

• $+, \cdot$ are binary operations on $F$.
This means that $+, \cdot$ are functions of two variables on $F$:

$+, \cdot : F \times F \to F$

It is customary to write $a + b$ instead of $+(a, b)$, and $a \cdot b$ or $ab$ instead of $\cdot(a, b)$.

• $0, 1$ are distinct designated members of $F$.
Such elements are also called nullary operations on $F$, or constants. We call $0$ the zero
element, or additive identity. Similarly, we call $1$ the unit element, or the multiplicative
identity.

• $-, {}^{-1}$ are unary operations on $F$.
Thus $-, {}^{-1}$ are functions from $F$ to $F$. However, ${}^{-1}$ is a partial function: $a^{-1}$ is
defined if and only if $a \ne 0$.






In addition, the operations are required to satisfy the following basic properties (axioms):

(C+)      $a + b = b + a$                                          (Commutativity of addition)
(A+)      $(a + b) + c = a + (b + c)$                              (Associativity of addition)
(Id+)     $a + 0 = a = 0 + a$                                      (Additive identity)
(Inv+)    $a + (-a) = 0 = (-a) + a$                                (Additive inverse)
(C·)      $a \cdot b = b \cdot a$                                  (Commutativity of multiplication)
(A·)      $(a \cdot b) \cdot c = a \cdot (b \cdot c)$              (Associativity of multiplication)
(Id·)     $a \cdot 1 = a = 1 \cdot a$                              (Multiplicative identity)
(Inv·)    $a \cdot (a^{-1}) = 1 = (a^{-1}) \cdot a$ when $a \ne 0$ (Multiplicative inverse)
(D)       $a \cdot (b + c) = a \cdot b + a \cdot c$                (Distributivity of $\cdot$ over $+$)

Remarks 1.2.3 We introduce here some basic (and familiar) rules designed to simplify notation:
Firstly, we will assume that the operations satisfy the usual order of precedence: ${}^{-1}$
before $\cdot$ before $-$ before $+$. Thus $ab^{-1} = a \cdot (b^{-1})$ (and not $(ab)^{-1}$), $ab + c = (ab) + c$ (and
not $a(b + c)$), etc.

Next, we define the expression "$b - a$" to mean "$b + (-a)$", and we define "$a/b$" to mean
"$a \cdot b^{-1}$".

By (A+), $(a + b) + c = a + (b + c)$. We may therefore omit the brackets and denote this
common value by $a + b + c$. Similarly, "$abc$" is defined to be the common value of $(ab)c$ and
$a(bc)$.

We will also write "$a^2 b$" instead of "$aab$", "$a^{-2}$" instead of "$(a^{-1})^2$", etc.

□ 

Are the above axioms sufficient to characterize the system of real numbers? No. As is 
demonstrated by the next example, there are many different examples of fields, and each field 
satisfies the axioms (C+),(A+),. . . (D). 

Examples 1.2.4 (1.) The set $\mathbb{Q}$ of rational numbers with the usual operations is a field.

(2.) The set $\mathbb{R}$ of real numbers with the usual operations is a field.

(3.) The set $\mathbb{Z}$ of integers with the usual operations is not a field. Why not?

(4.) The set $\mathbb{C}$ of complex numbers with the usual operations is a field.

(5.) Let $F = \{a, b\}$. Define the operations $+, \cdot$ on $F$ as follows:

     +  | a  b          ·  | a  b
    ----+------        ----+------
     a  | a  b          a  | a  a
     b  | b  a          b  | a  b

It is easy to verify that $+, \cdot$ are commutative and associative.
For example, $a + b = b = b + a$, and $a + (a + b) = a + b = b = a + b = (a + a) + b$.

We see that $a$ behaves like an additive identity, in that $a + x = x$ for all $x \in F$.
Furthermore, $b$ behaves like a multiplicative identity, in that $b \cdot x = x$ for all $x \in F$. Let's
therefore make $a$ the designated element $0$, and make $b$ the designated element $1$, so that
$F = \{0, 1\}$. Now our tables look like:

     +  | 0  1          ·  | 0  1
    ----+------        ----+------
     0  | 0  1          0  | 0  0
     1  | 1  0          1  | 0  1

To make $F$ into a field, we must also see if we can define two unary operations $-$ and ${}^{-1}$.
Since $1 + 1 = 0$ and $0 + 0 = 0$, we define $- : F \to F$ by: $-1 = 1$, $-0 = 0$. The operation
${}^{-1}$ needs only be defined for the element $1$. As $1 \cdot 1 = 1$, we define $1^{-1} = 1$. Thus

     x  | -x            x  | x^{-1}
    ----+----          ----+--------
     0  | 0             1  | 1
     1  | 1

It is easy to see that the addition and multiplication defined on $F$ are just ordinary
addition and multiplication, modulo 2.
The field $F$ goes by the name $\mathbb{Z}_2$.
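
For a finite set, the field axioms can be checked mechanically by trying all combinations of elements. The following Python sketch is a hypothetical helper, not part of the notes: the names `is_field`, `mod_field`, `ADD` and `MUL` are illustrative, and the inverses are searched for rather than supplied as operations. It tests the two-element tables above as well as arithmetic modulo $n$.

```python
from itertools import product


def is_field(elements, add, mul, zero, one):
    """Brute-force check of (C+), (A+), (Id+), (Inv+), (C.), (A.), (Id.),
    (Inv.) and (D) on a finite set, together with closure and 0 != 1."""
    E = list(elements)
    if zero == one:
        return False
    for a, b in product(E, repeat=2):
        if add(a, b) not in E or mul(a, b) not in E:         # closure
            return False
        if add(a, b) != add(b, a) or mul(a, b) != mul(b, a):  # commutativity
            return False
    for a, b, c in product(E, repeat=3):
        if add(add(a, b), c) != add(a, add(b, c)):            # (A+)
            return False
        if mul(mul(a, b), c) != mul(a, mul(b, c)):            # (A.)
            return False
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):    # (D)
            return False
    for a in E:
        if add(a, zero) != a or mul(a, one) != a:             # identities
            return False
        if not any(add(a, b) == zero for b in E):             # (Inv+)
            return False
        if a != zero and not any(mul(a, b) == one for b in E):  # (Inv.)
            return False
    return True


def mod_field(n):
    """Check whether {0, ..., n-1} with arithmetic modulo n is a field."""
    return is_field(range(n), lambda a, b: (a + b) % n,
                    lambda a, b: (a * b) % n, 0, 1)


# The tables for F = {a, b} from the example above.
ADD = {("a", "a"): "a", ("a", "b"): "b", ("b", "a"): "b", ("b", "b"): "a"}
MUL = {("a", "a"): "a", ("a", "b"): "a", ("b", "a"): "a", ("b", "b"): "b"}

print(is_field("ab", lambda x, y: ADD[(x, y)], lambda x, y: MUL[(x, y)], "a", "b"))
print([n for n in range(2, 12) if mod_field(n)])
```

Such a check only covers one finite structure at a time; statements about all fields still have to be proved from the axioms.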

Next, we prove some basic results about arithmetic inside a field. Most of these look
obvious, but that's only because they are already so familiar. To formally prove these results,
we are allowed to use only the field axioms. Consequently, these results will be true in any
field: not just the familiar ones, such as $\mathbb{Q}, \mathbb{R}, \mathbb{C}$, but also the as yet unfamiliar ones, such
as $\mathbb{Z}_p$ ($p$ a prime).

Proposition 1.2.5 The axioms (C+), (A+), (Id+), (Inv+) imply the following statements:

(a) If $x + y = x + z$, then $y = z$; (cancellation)

(b) If $x + y = x$, then $y = 0$; (uniqueness of additive identity)

(c) If $x + y = 0$, then $y = -x$; (uniqueness of additive inverse)

(d) $-(-x) = x$;

Proof: (a) Suppose that $x + y = x + z$. Then

\[
\begin{aligned}
y &= 0 + y && \text{(Id+)} \\
&= (-x + x) + y && \text{(Inv+)} \\
&= -x + (x + y) && \text{(A+)} \\
&= -x + (x + z) && \text{assumption} \\
&= (-x + x) + z && \text{(A+)} \\
&= 0 + z && \text{(Inv+)} \\
&= z && \text{(Id+)}
\end{aligned}
\]

(b) If $x + y = x$, then $x + y = x + 0$, so the result follows from (a).

(c) If $x + y = 0$, then

\[
\begin{aligned}
y &= 0 + y \\
&= (-x + x) + y \\
&= -x + (x + y) \\
&= -x + 0 \\
&= -x
\end{aligned}
\]

(d) $-x + x = 0$, so by (c), we must have $x = -(-x)$.



Proposition 1.2.6 The axioms (C·), (A·), (Id·), (Inv·) imply the following statements:

(a) If $xy = xz$, and $x \ne 0$, then $y = z$; (cancellation)

(b) If $xy = x$, and $x \ne 0$, then $y = 1$; (uniqueness of multiplicative identity)

(c) If $xy = 1$, and $x \ne 0$, then $y = x^{-1}$; (uniqueness of multiplicative inverse)

(d) If $x \ne 0$, then $(x^{-1})^{-1} = x$;



Exercise 1.2.7 Prove the preceding proposition. 



□ 



Note that the distributive law (D) was not used in proving the above statements. If we invoke 
the distributive law, we can prove more: 



Proposition 1.2.8 The field axioms imply the following statements:

(a) $0 \cdot x = 0$;

(b) If $x \ne 0$, $y \ne 0$, then $xy \ne 0$; thus if $xy = 0$, then either $x = 0$ or $y = 0$;

(c) $(-x)y = -xy = x(-y)$;

(d) $(-x)(-y) = xy$;

Exercise 1.2.9 Prove the preceding proposition. Here are some hints:

(a) Justify the following string of equalities: $x + 0 \cdot x = 1 \cdot x + 0 \cdot x = (1 + 0) \cdot x = 1 \cdot x = x$.
Now use Proposition 1.2.5(b).

(b) If $x, y$ are non-zero, then $(x^{-1} y^{-1})(xy) = 1$. Hence, by (a), $xy \ne 0$. (Why can't we have
$0 = 1$?)

(c) $(-x)y + xy = (-x + x)y = 0$, so $(-x)y = -(xy)$, by Proposition 1.2.5(c).

(d) Apply (c) twice, and invoke Proposition 1.2.5(d).



□ 
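
To see that statement (b) really depends on the field axioms, one can compare arithmetic modulo a prime with arithmetic modulo a composite number. The Python sketch below is illustrative only; the function name `check_prop_1_2_8` is hypothetical, and negation modulo $n$ is taken to be `(-x) % n`.

```python
def check_prop_1_2_8(n):
    """Return the labels of the statements (a)-(d) of Proposition 1.2.8
    that hold for arithmetic modulo n."""
    R = range(n)

    def neg(x):
        return (-x) % n

    def mul(x, y):
        return (x * y) % n

    holds = []
    if all(mul(0, x) == 0 for x in R):
        holds.append("a")
    if all(mul(x, y) != 0 for x in R for y in R if x != 0 and y != 0):
        holds.append("b")
    if all(mul(neg(x), y) == neg(mul(x, y)) == mul(x, neg(y))
           for x in R for y in R):
        holds.append("c")
    if all(mul(neg(x), neg(y)) == mul(x, y) for x in R for y in R):
        holds.append("d")
    return holds


print(7, check_prop_1_2_8(7))   # the integers modulo 7 form a field
print(6, check_prop_1_2_8(6))   # modulo 6, 2 * 3 = 0 although 2, 3 != 0
```

Modulo 6 the axiom (Inv·) fails, and with it statement (b), whereas (a), (c) and (d) survive because their proofs do not use multiplicative inverses.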

In this section, we concentrated on the arithmetic of fields. We saw that there are many
different fields, some finite (e.g. $\mathbb{Z}_2$) and some infinite (e.g. $\mathbb{C}$). The field axioms are
therefore insufficiently strong to characterize the system of real numbers. In the next section,
we go some way towards remedying this situation, by adding more axioms.

1.3 Ordered Fields 

In addition to basic arithmetic, intuition about real numbers also contains the notion of order,
i.e. we consider some real numbers to be less than others. This notion does not make sense
in all fields. For example, the field $\mathbb{C}$ of complex numbers has no natural ordering. (Is $2 + 3i$
greater or less than $3 + 2i$? The question does not make sense.) So the notion of order is
something extra, something outside the arithmetic of fields. We must therefore write down a
set of axioms for the behaviour of the order relation on the reals.

Recall that a partial ordering $\le$ on a set is a binary relation satisfying the following axioms:

(PO-R) $s \le s$

(PO-A) $s \le t$ and $t \le s$ imply $s = t$

(PO-T) $s \le t$ and $t \le u$ imply $s \le u$

An additional axiom, (TO), strengthens the notion of partial ordering to that of a total
ordering:

(TO) Either $s \le t$ or $t \le s$ (or both)

A remark on notation: We write "$s < t$" instead of "$s \le t$ and $s \ne t$". Similarly "$s \ge t$"
is just another way of saying "$t \le s$". An analogous statement holds for "$s > t$".

An ordered field is a field which is also a totally ordered set, subject to two additional 
conditions: 

Definition 1.3.1 An ordered field is a tuple $(F, +, \cdot, -, {}^{-1}, 0, 1, \le)$ satisfying the field
axioms (C+), (A+), (Id+), (Inv+), (C·), (A·), (Id·), (Inv·), (D), and the order axioms (PO-R),
(PO-A), (PO-T), (TO), such that, in addition

(OF+) $\forall x \forall y \forall z\, [x < y \to x + z < y + z]$
(OF·) $\forall x \forall y\, [x > 0 \land y > 0 \to xy > 0]$

The fields $\mathbb{Q}$ and $\mathbb{R}$ are ordered fields (with the usual ordering). However,

• No finite field is an ordered field. 

• The field C cannot be made into an ordered field. 

You will be required to prove these facts soon. 

Definition 1.3.2 If $x > 0$, we say that $x$ is positive; if $x < 0$, we say that $x$ is negative.
We say that $x, y$ have opposite signs if one of $x, y$ is positive and the other negative.

The following proposition contains some familiar results on the interaction between order 
and arithmetic. Once again, we stress that these will hold true in any ordered field (not just 
the reals) , as only the ordered field axioms will be used in the proof. 

Proposition 1.3.3 Let $(F, +, \cdot, -, {}^{-1}, 0, 1, \le)$ be an ordered field.

(a) $x < y$ if and only if $y - x > 0$;

(b) If $x \ne 0$, then $x$ and $-x$ have opposite signs.

(c) If $x > 0$ and $y < z$, then $xy < xz$; if $x < 0$ and $y < z$, then $xy > xz$;

(d) If $x \ne 0$, then $x^2 > 0$;

(e) $1 > 0$;

(f) If $x \ne 0$, $y \ne 0$, then $x, y$ have opposite signs if and only if $xy < 0$.

(g) $x > 0$ implies $x^{-1} > 0$; $x < 0$ implies $x^{-1} < 0$;

(h) If $0 < x < y$, then $0 < y^{-1} < x^{-1}$;






Proof: (a) $x < y$ implies $x + (-x) < y + (-x)$ by (OF+), so $0 < y - x$. Similarly $0 < y - x$
implies $x + 0 < x + (y - x)$, i.e. $x < y$.

(b) If $x < 0$, then $x + (-x) < 0 + (-x)$, so $0 < -x$. A similar proof works for the case $x > 0$.

(c) $x > 0$ and $z - y > 0$ implies $x(z - y) > 0$ by (OF·), so that $xz - xy > 0$, i.e. $xy < xz$ by (a).
A similar proof works for the other case.

(d) This is clear if $x > 0$. Else, $-x > 0$ (why?), and so $(-x) \cdot (-x) > 0$, by (OF·). But
$(-x) \cdot (-x) = x^2$, by Proposition 1.2.8(d).

(e) By (d), since $1 = 1^2$ and $1 \ne 0$.

(f) If $x > 0$, $y < 0$, then $xy < 0$, by (c). Conversely, suppose that $xy < 0$. Then $x \ne 0$ and
$y \ne 0$. Now if $x, y$ are both positive, then $xy$ is positive, by (OF·); if $x, y$ are both negative,
then $-x, -y$ are both positive (by (b)), so that $(-x)(-y) = xy$ is positive, by (OF·). Hence
$x, y$ must have opposite signs.

(g) Since $x x^{-1} = 1 > 0$, $x$ and $x^{-1}$ cannot have opposite signs.

(h) Suppose that $0 < x < y$. Then $x^{-1}, y^{-1} > 0$, so $0 < x \cdot x^{-1} < y x^{-1}$, by (OF·). Hence
$1 < y x^{-1}$, and so $y^{-1} < y^{-1}(y x^{-1}) = x^{-1}$.
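
Since $\mathbb{Q}$ is an ordered field, the statements of Proposition 1.3.3 can be spot-checked with exact rational arithmetic. The Python sketch below is an illustration under stated assumptions, not a substitute for the proof: the names `SAMPLES` and `check_prop_1_3_3` are hypothetical, and only a small finite sample of rationals is tested.

```python
from fractions import Fraction
from itertools import product

# A small sample of rationals p/q; duplicates such as 2/2 = 1 are harmless.
SAMPLES = [Fraction(p, q) for p in range(-4, 5) for q in (1, 2, 3)]


def check_prop_1_3_3(samples):
    """Assert statements (a)-(h) of Proposition 1.3.3 on the given sample."""
    for x, y, z in product(samples, repeat=3):
        assert (x < y) == (y - x > 0)                  # (a)
        if x > 0 and y < z:
            assert x * y < x * z                       # (c), first half
        if x < 0 and y < z:
            assert x * y > x * z                       # (c), second half
    for x, y in product(samples, repeat=2):
        if x != 0 and y != 0:
            opposite_signs = (x > 0) != (y > 0)
            assert opposite_signs == (x * y < 0)       # (f)
        if 0 < x < y:
            assert 0 < 1 / y < 1 / x                   # (h)
    for x in samples:
        if x != 0:
            assert (x > 0) != (-x > 0)                 # (b)
            assert x * x > 0                           # (d)
            assert (1 / x > 0) == (x > 0)              # (g)
    assert Fraction(1) > 0                             # (e)
    return True


print(check_prop_1_3_3(SAMPLES))
```

`Fraction` is used instead of floating point numbers so that the comparisons are exact.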



Exercise 1.3.4 (a) Suppose that $(F, +, \cdot, -, {}^{-1}, 0, 1)$ is a field, and that $P \subseteq F$ has the
following properties:

(i) For each $x \in F$, exactly one of the following is true:

$x = 0$ or $x \in P$ or $-x \in P$

(ii) $x, y \in P$ implies $xy \in P$

(iii) $x, y \in P$ implies $x + y \in P$;

Define a binary relation $\le$ on $F$ by

$x \le y \iff x = y$ or $y - x \in P$

Show that $\le$ makes $F$ into an ordered field. Also show that $P$ is precisely the set of
positive elements.

(b) Prove that there are no finite ordered fields.
[Hint: Show that

$0 < 1 < 1 + 1 < 1 + 1 + 1 < 1 + 1 + 1 + 1 < \dots$

]

(c) Show that $\mathbb{C}$ cannot be made into an ordered field: it is impossible to define a total
ordering $\le$ on $\mathbb{C}$ so that (OF+) and (OF·) are satisfied.



□ 
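
The hint in part (b) can be explored in the fields $\mathbb{Z}_p$. The Python sketch below is only an illustration for this one family of finite fields and does not settle the exercise; the function name `chain_of_ones` is hypothetical.

```python
def chain_of_ones(p):
    """Compute 0, 1, 1+1, 1+1+1, ... modulo p until a value repeats.

    Returns the list of distinct values met so far, together with the first
    value that repeats.
    """
    seen = []
    value = 0
    while value not in seen:
        seen.append(value)
        value = (value + 1) % p     # add one more copy of 1
    return seen, value


print(chain_of_ones(5))
```

In $\mathbb{Z}_5$ the chain returns to 0 after five steps, so its members cannot all be strictly increasing as the hint would require in an ordered field.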






1.4 The Continuum 

Let's take stock for a moment: We are trying to find a complete set of axioms for the real
numbers, i.e. we are attempting to find a set of rules which completely capture our intuition
about the behaviour of the reals. Our intuition involves both arithmetic and order-theoretic
properties, and, so far, we've written down 15 axioms, the axioms of an ordered field: (C+),
(A+), (Id+), (Inv+), (C·), (A·), (Id·), (Inv·), (D), (PO-R), (PO-A), (PO-T), (TO), (OF+), and
(OF·). Using these axioms, and nothing else, we've managed to prove a number of interesting
properties: $(-x)(-y) = xy$; squares ($x^2$) are non-negative, etc. These properties hold in any
ordered field.

So do the ordered field axioms completely capture our intuition about the behaviour of
the reals? No. The set $\mathbb{Q}$ of rational numbers also forms an ordered field, and we know that
$\mathbb{Q} \ne \mathbb{R}$ (e.g. there are numbers, such as $\sqrt{2}$, which are not rational). Thus we have some
additional intuition which allows us to distinguish the set of reals from the set of rationals.
What could it be?

Geometry comes into play. We have an additional intuition about non-negative real 
numbers as being lengths of straight line segments: We can measure the length of a line 
segment using a ruler, and the length will be a real number. In this way, we come to regard 
the set of real numbers as points on a straight line which extends indefinitely in both directions. 

Example 1.4.1 (1) Consider an isosceles right-angled triangle, with right-angle sides both 1
unit in length. Our intuition dictates that the hypotenuse have a length. By Pythagoras'
Theorem, the length of the hypotenuse is a number $x$ satisfying $x^2 = 2$, $x > 0$.

(2) Consider the graph of the parabola $f(x) = x^2 - 2$ in the Cartesian plane. We see that
$f(0) < 0$, and that $f(2) > 0$. Our intuition dictates that the graph cut the $x$-axis
somewhere between 0 and 2. It is cut at an $x$ satisfying $x^2 = 2$, $x > 0$.

□ 

Of course, you say, in both examples we are seeking the number $x = \sqrt{2}$. However, the symbol
$\sqrt{\phantom{x}}$ is not (yet) part of our language. The existence of roots is something genuinely new, as we
shall show in the next section.

The following proposition should be familiar to you: 

Proposition 1.4.2 There is no rational number $x \in \mathbb{Q}$ such that $x^2 = 2$.

Proof: The proof is by contradiction. Suppose that $x^2 = 2$ and that $x = \frac{n}{m}$. Choose $m_0$ to
be the least positive integer such that there is an integer $n_0$ for which $x = \frac{n_0}{m_0}$. Clearly

$2 m_0^2 = n_0^2$

so that $n_0$ is even, i.e. $n_0 = 2 l_0$ for some integer $l_0$. Then

$m_0^2 = 2 l_0^2$

so $m_0$ is even, i.e. $m_0 = 2 k_0$ for some positive integer $k_0$. It follows that $x = \frac{l_0}{k_0}$, in
contradiction to the choice of $m_0$ (since $k_0 < m_0$).

⊣






So $\sqrt{2}$, if it exists, is irrational.
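
A finite search cannot replace the proof, but it shows what the proposition predicts. The Python sketch below is an illustration only; the function name `rational_square_roots_of_two` and the search bound are hypothetical choices. It looks for fractions $\frac{n}{m}$ with small denominators whose square is exactly 2.

```python
from fractions import Fraction


def rational_square_roots_of_two(max_denominator):
    """Search all fractions n/m with 1 <= m <= max_denominator and
    0 < n <= 2m for solutions of n^2 = 2 m^2."""
    hits = []
    for m in range(1, max_denominator + 1):
        for n in range(1, 2 * m + 1):     # a positive root must lie below 2
            if n * n == 2 * m * m:        # exact integer comparison
                hits.append(Fraction(n, m))
    return hits


print(rational_square_roots_of_two(300))
```

The search comes back empty for every bound one tries, in agreement with Proposition 1.4.2, which guarantees this for all denominators at once.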

Exercise 1.4.3 (a) Let $f(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0$ be an $n$th-degree polynomial
with integer coefficients $a_0, \dots, a_n$, where $a_n \ne 0$. Suppose that $\frac{p}{q} \in \mathbb{Q}$ is a root of $f(x)$,
where $p, q \in \mathbb{Z}$ are relatively prime¹. Show that $p$ is a factor of $a_0$ and that $q$ is a factor
of $a_n$.

(b) Consider the special case where $a_n = 1$. Show that all rational roots of $f(x)$ must be
integers.

(c) Conclude that the following numbers, if they exist, are irrational: \'^,\^12, (5 — V^)^ . 

□ 
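
Taking the statement of part (a) for granted, the rational roots of an integer polynomial can be found by testing finitely many candidates. The Python sketch below is a hypothetical illustration, not a solution of the exercise: the names `divisors` and `rational_roots` are made up, coefficients are listed from $a_n$ down to $a_0$, and both $a_n$ and $a_0$ are assumed to be non-zero.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a non-zero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(coeffs):
    """Rational roots of a_n x^n + ... + a_0, with coeffs = [a_n, ..., a_0].

    Relies on part (a): a root p/q in lowest terms has p dividing a_0 and
    q dividing a_n. Assumes a_n != 0 and a_0 != 0.
    """
    a_n, a_0 = coeffs[0], coeffs[-1]
    candidates = {
        Fraction(sign * p, q)
        for p in divisors(a_0)
        for q in divisors(a_n)
        for sign in (1, -1)
    }

    def f(x):
        value = Fraction(0)
        for c in coeffs:            # Horner's scheme, exact arithmetic
            value = value * x + c
        return value

    return sorted(x for x in candidates if f(x) == 0)


print(rational_roots([2, -3, 1]))   # 2x^2 - 3x + 1 = (2x - 1)(x - 1)
print(rational_roots([1, 0, -2]))   # x^2 - 2 has no rational roots
```

For $x^2 - 2$ the only candidates are $\pm 1$ and $\pm 2$, none of which is a root, which matches Proposition 1.4.2.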

Here's a brief summary of the salient points of this section: 

• We have provided a set of axioms that capture our intuition about the arithmetic and 
order-theoretic properties of real numbers. 

• We also have a geometric intuition about non-negative real numbers as being the lengths
of line segments. Applying the Theorem of Pythagoras to the hypotenuse of an isosceles
right-angled triangle, we are led to believe that there exists a non-negative real number
$x$ with the property that $x^2 = 2$.

• However, we proved that x, if it exists, cannot be a rational number. 

• Because the field Q of rational numbers satisfies all the axioms proposed, it follows that 
the existence of x cannot be proved from those axioms alone. 

• Thus the set of axioms proposed so far is inadequate: It does not completely capture 
our intuition about the real number system, because there are "truths" that we cannot 
prove. 

• It is therefore necessary to find at least one more axiom. 

From where will we get such an axiom? The intuition that leads us to believe in the existence
of the number $\sqrt{2}$ has its origins in the geometric concept of length. Thus far, we have not
considered any geometric axioms at all. We would like to keep things as simple as possible, and
avoid an axiomatization that depends too much on geometric concepts. Is it really necessary
to incorporate, say, Euclid's axioms of plane geometry, to formalize the notion of the length
of a line segment and the notion of a right-angled triangle, all in order to prove Pythagoras'
Theorem, so that we can finally obtain the existence of $\sqrt{2}$?

Fortunately not. Perform the following thought experiment, an exercise in visualization:
Let $N$ be a natural number, and consider an arbitrary, but non-empty, set $\mathcal{L} = \{L_i : i \in I\}$
of line segments, each of length $\le N$. Align them, and stack them together on top of each
other, so that you see just a single line segment $L$. Clearly the following hold:

(i) The length of the stack, $L$, is greater than or equal to the length of each $L_i \in \mathcal{L}$;

¹Two integers are relatively prime if their greatest common divisor is 1. In this case, this means that the
fraction $\frac{p}{q}$ cannot be simplified any further.






(ii) The length of $L$ is finite, being less than or equal to $N$.

(iii) Any line segment which is strictly shorter than $L$ is also shorter than some $L_i \in \mathcal{L}$.

(iv) Thus $L$ is the shortest line segment which has length greater than or equal to the length
of each $L_i \in \mathcal{L}$.

Note that $L$ may be strictly longer than each of the $L_i$. For example, if the index set is the
set of natural numbers (i.e. $I = \mathbb{N}$) and $L_i = 1 - 2^{-i}$, then each $L_i < 1$. But the length
of the stack is $L = 1$. Thus the length $L$ need not be a member of the given set of lengths
$\{L_i : i \in I\}$. It is something new.

We now rephrase and generalize the above intuition as follows: 

If $A = \{a_i : i \in I\}$ is a non-empty set of real numbers which
is bounded above, then it has a least upper bound, i.e. there
exists a number $a$ such that $a \ge a_i$ for all $a_i \in A$, and if
$a' < a$, then there is $a_i \in A$ such that also $a' < a_i$.

It turns out that with this final axiom we have completely characterized the set of real 
numbers. 
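
The two halves of the least upper bound property can be watched at work on the lengths $L_i = 1 - 2^{-i}$ from the thought experiment. The Python sketch below is an illustration only; the function names `length` and `witness` are hypothetical, and the index set is assumed to start at $i = 1$.

```python
from fractions import Fraction


def length(i):
    """L_i = 1 - 2**(-i), as an exact fraction."""
    return 1 - Fraction(1, 2 ** i)


def witness(a_prime):
    """For a_prime < 1, return the least index i >= 1 with a_prime < L_i."""
    if a_prime >= 1:
        raise ValueError("no L_i exceeds a number that is >= 1")
    i = 1
    while length(i) <= a_prime:
        i += 1
    return i


# 1 is an upper bound: every L_i checked here is strictly below 1.
print(all(length(i) < 1 for i in range(1, 50)))

# Nothing smaller than 1 is an upper bound: some L_i exceeds 999/1000.
i = witness(Fraction(999, 1000))
print(i, length(i))
```

Any proposed bound $a' < 1$ is beaten by some $L_i$, so 1 is the least upper bound even though no $L_i$ equals 1.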

1.5 The Completeness Axiom 

We begin with some definitions: 

Definition 1.5.1 Let (P, <) be a total ordering and let A O P. 

(a) We say that an element n G P is an upper bound for A if and only if 

Va G A{a < u) 

(b) Similarly, we say that ^ G P is an lower bound for A if and only if 

Va G A{1 < a) 

(c) We say that A is bounded if and only if it has both an upper bound and a lower bound. 

(d) We say that uq is the supremum, or least upper bound of A if and only if the following 
hold: 

(i) Mo is an upper bound of A; 

(ii) If u is any upper bound of A, then uq < u. 

We write 

no = sup^ or uq = l.u.b.(A) 

(e) We say that $l_0$ is the infimum, or greatest lower bound, of $A$ if and only if the following hold:

(i) $l_0$ is a lower bound of $A$;

(ii) If $l$ is any lower bound of $A$, then $l \le l_0$.

We write

$$l_0 = \inf A \quad\text{or}\quad l_0 = \text{g.l.b.}(A)$$

(Footnote to Definition 1.5.1: to say that $(P, \le)$ is a total ordering means that $(P, \le)$ satisfies (PO-R), (PO-A), (PO-T) and (TO).)



(f) We say that $u_0$ is the maximum of $A$, and write $u_0 = \max A$, if and only if

$$u_0 \in A \text{ and } u_0 = \sup A$$

(g) Similarly, we say that $l_0$ is the minimum of $A$, denoted $l_0 = \min A$, if and only if

$$l_0 \in A \text{ and } l_0 = \inf A$$

□

Definition 1.5.2 Let $(P, \le)$ be a total ordering.

(a) Let $a, b \in P$ with $a < b$. We define the following sets:

$$[a,b] = \{x \in P : a \le x \le b\}$$
$$(a,b) = \{x \in P : a < x < b\}$$
$$(a,b] = \{x \in P : a < x \le b\}$$
$$[a,b) = \{x \in P : a \le x < b\}$$

(b) A set $A \subseteq P$ is called an interval (in $P$) if and only if whenever $a, b \in A$ with $a < b$, then $[a,b] \subseteq A$.

Examples 1.5.3 (1) If $a < b$ in a total ordering $(P, \le)$, then $[a,b]$, $(a,b)$, $(a,b]$, $[a,b)$ are intervals.

(2) A subset $A$ of a total ordering $(P, \le)$ is bounded if and only if there exist $a, b \in P$ such that $A \subseteq [a,b]$.

(3) In $(\mathbb{R}, \le)$, we have:

(a) $1 = \max[0,1] = \sup[0,1]$; $0 = \min[0,1] = \inf[0,1]$.

(b) $1 = \sup[0,1)$, but $\max[0,1)$ does not exist.

(c) If $A = \{x \in \mathbb{R} : x^2 < 2\}$, then $\sup A = \sqrt{2}$.

(d) If $A \ne \varnothing$, then $\inf A \le \sup A$.

(e) If $A = \varnothing$, then $A$ has no supremum and no infimum. First note that every $u \in \mathbb{R}$ is an upper bound for $A$: for if $u$ is not an upper bound, then there must be an $a \in A$ such that $u < a$. However, $A$ is empty, so no such $a$ can be found.

Similarly, every real number is also a lower bound for $A$. We therefore write

$$\sup \varnothing = -\infty, \qquad \inf \varnothing = +\infty$$






(4) In $(\mathbb{Q}, \le)$, the set $A = \{x \in \mathbb{Q} : x^2 < 2\}$ has no supremum: If $u \in \mathbb{Q}$ is an upper bound for $A$, then $u$ is a rational number which is greater than $\sqrt{2}$. We can then choose a rational number $u'$ such that $\sqrt{2} < u' < u$. Thus: given any upper bound $u$ for $A$, we can find another upper bound $u' < u$. Hence $A$ has no least upper bound (in $\mathbb{Q}$).

□ 



Definition 1.5.4 (Completeness)

Let $F$ be an ordered field. We say that $F$ is complete if and only if whenever a non-empty $A \subseteq F$ has an upper bound, it has a least upper bound, i.e. $\sup A$ exists (and belongs to $F$).



Exercise 1.5.5 Show that if an ordered field F is complete, then any non-empty subset of 
F which is bounded below has a greatest lower bound. Thus: In a complete ordered field any 
non-empty bounded set has both a supremum and an infimum. 

□ 

Note that $\mathbb{Q}$ is not complete: The set $A = \{x \in \mathbb{Q} : x^2 < 2\}$ has an upper bound in $\mathbb{Q}$, but no supremum. However, our intuition about real numbers as lengths of line segments led us to conclude, via an experiment in visualization, that the set of real numbers is complete. This is our final axiom for the system of real numbers. The Completeness Axiom allows us to distinguish between $\mathbb{R}$ and $\mathbb{Q}$.

We state the following result without proof: 



Theorem 1.5.6 There exists a complete ordered field $(\mathbb{R}, +, \cdot, {}^{-1}, 0, 1, \le)$. Moreover, there is essentially only one complete ordered field, in the sense that any two complete ordered fields are isomorphic.



Remarks 1.5.7 The notion of field isomorphism will be studied in detail in an algebra course, so we won't give a formal definition here. To say that two algebraic structures, such as two fields $F$ and $G$, are isomorphic, means that they are essentially the same object.

As an analogy, consider two copies of War and Peace. They are not equal: One is in my left hand, the other in my right hand; one is a hardback, the other a paperback, etc. But they are nevertheless the same book. Similarly, two mathematical structures can be the same, without being equal.

More formally, two fields $F, G$ are isomorphic if there is a bijection $\varphi : F \to G$ which preserves all the operations. Thus $F$ and $G$ have the same structure, and it is only the names $f, g$ of the field elements that are different. The element $f$ of $F$ corresponds to the element $\varphi(f)$ of $G$. Any relationship between elements $f_1, \dots, f_n$ of $F$ will also hold for $\varphi(f_1), \dots, \varphi(f_n)$ in $G$. In particular, if the relation $f_1 + f_2 = f_3$ holds in $F$, then the relation $\varphi(f_1) + \varphi(f_2) = \varphi(f_3)$ holds in $G$, so that $\varphi(f_1 + f_2) = \varphi(f_1) + \varphi(f_2)$. Similarly, $\varphi(f^{-1}) = \varphi(f)^{-1}$, $\varphi(0) = 0$, etc. (Note that the $+$ in $\varphi(f_1 + f_2)$ refers to addition in $F$, whereas the $+$ in $\varphi(f_1) + \varphi(f_2)$ refers to addition in $G$. The same goes for the other operations and constants.)



□ 






Exercise 1.5.8 Consider the following two-player game, called FIFTEEN: Players take turns to pick a number from the set $\{1, 2, \dots, 9\}$. No number may be picked more than once (i.e. if Player I picks the number 3, that cannot be picked later by either Player I or Player II). Perhaps the simplest way to play would be with nine cards, an ace (with value 1), 2, 3, ..., 9, with players alternately picking a card. The object is to be the first player to possess three numbers (cards) whose values add up to 15.

Convince yourself that this game is, in a sense, isomorphic to the game of noughts-and-crosses (also known as tic-tac-toe).



    6   7   2

    1   5   9

    8   3   4



Now astound friends and family with your mental skills by persuading them to play 
FIFTEEN, while you secretly play noughts-and-crosses. 

(The point is that, even when "things" are "the same", one representation of that "thing" 
may be vastly superior to another.) 

□ 
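The correspondence in Exercise 1.5.8 can be checked mechanically. The following is a minimal sketch, not part of the original notes: it assumes the 3x3 grid printed in the exercise (rows 6 7 2, 1 5 9, 8 3 4), and the helper names `winning_lines` and `fifteen_triples` are made up for illustration. It compares the rows, columns and diagonals of the grid with the three-element subsets of {1, ..., 9} that sum to 15.

```python
from itertools import combinations

# The 3x3 grid from the exercise, row by row (assumed reading of the figure).
SQUARE = [
    [6, 7, 2],
    [1, 5, 9],
    [8, 3, 4],
]

def winning_lines(square):
    """Rows, columns and both diagonals of a 3x3 grid, each as a set."""
    rows = [set(row) for row in square]
    cols = [{square[i][j] for i in range(3)} for j in range(3)]
    diags = [{square[i][i] for i in range(3)},
             {square[i][2 - i] for i in range(3)}]
    return rows + cols + diags

def fifteen_triples():
    """All 3-element subsets of {1,...,9} whose sum is 15."""
    return [set(c) for c in combinations(range(1, 10), 3) if sum(c) == 15]

lines = winning_lines(SQUARE)
triples = fifteen_triples()

# The two collections coincide exactly when winning at FIFTEEN is the
# same thing as completing a line on the grid.
same = sorted(map(sorted, lines)) == sorted(map(sorted, triples))
print(len(lines), len(triples), same)
```

If the two collections agree, then picking a number in FIFTEEN corresponds to marking its cell, and a winning triple corresponds to a completed line.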

We have now accomplished our goal: We define the system of real numbers to be a 
complete ordered field. By the above theorem, there is essentially only one such structure. 
Thus, whenever we say "such and such is true in the set of real numbers" , we mean "such 
and such is true in a complete ordered field." 

Remarks 1.5.9 (a) So now we have defined the reals to be a complete ordered field. But, I hear you mutter, isn't a real number an object like 3.14159265...? Not quite.

3.14159265... (decimal) is a representation of a certain real number. The symbol $\pi$ is another, as is 11.001001000011111... (in binary). All of these represent the same real number, but they are not the same as that number. We will prove in the next chapter that every element of a complete ordered field has a decimal (or binary) representation.






Note that we neatly side-step the following question: "What is a real number?" This question is not answered by the proof of Theorem 1.5.6, as there are several proofs, each giving a different construction of real numbers.

The question is not important for the purpose of mathematics: It is the relationships 
between real numbers that count, not their "true nature" (whatever that might mean). 

(b) Note that the Completeness Axiom is of a very different nature to the other axioms. All the other axioms speak about elements of fields. For example, for every $x, y, z \in F$, the elements $(x + y) + z$ and $x + (y + z)$ are equal (associativity of $+$); if $x < y$ and $y < z$ then $x < z$ (transitivity of $<$), etc. In logical parlance, these are first order axioms.

The Completeness Axiom, on the other hand, speaks about sets of field elements: For every non-empty $A \subseteq F$, if $A$ is bounded above, then $A$ has a supremum. In logical parlance, this is a second order axiom.



□ 



So far, while operating at the intuitive level, we have always assumed that $\mathbb{Q} \subseteq \mathbb{R}$. But now that we have formally defined $\mathbb{R}$ as a complete ordered field, we must ensure that this is true. The proof requires the notion of field isomorphism, and therefore properly belongs to algebra. We merely provide an intuitive outline:

Proposition 1.5.10 The field $\mathbb{Q}$ of rational numbers is the smallest ordered field, in the following sense: Any ordered field contains (a copy of) $\mathbb{Q}$ as a subfield. In particular, $\mathbb{Q} \subseteq \mathbb{R}$.

"Proof": Let $F$ be an ordered field. We show that each rational number $q \in \mathbb{Q}$ can be identified with an element $(q)_F \in F$. If $n \in \mathbb{N}$, we identify it with the field element

$$(n)_F = \underbrace{(1 + 1 + \dots + 1)}_{n \text{ times}} \in F$$

Also identify the number zero with the zero field element, i.e. $(0)_F = 0$. If $n$ is a negative integer, we define $(n)_F = -(-n)_F$. ($-n$ is a positive integer, so $(-n)_F$ has already been defined.) Next, if $\frac{n}{m} \in \mathbb{Q}$, we identify it with the $F$-element

$$\left(\tfrac{n}{m}\right)_F = (n)_F\,(m)_F^{-1} \quad\text{for } m \ne 0$$

It is not hard to show that the map $\varphi : \mathbb{Q} \to F : \frac{n}{m} \mapsto \left(\frac{n}{m}\right)_F$ is well-defined and injective, and that it preserves the relations between elements, i.e. that $\varphi$ is an isomorphism between $\mathbb{Q}$ and the subfield $\{\left(\frac{n}{m}\right)_F : m, n \in \mathbb{Z}, m \ne 0\}$.

⊣

Theorem 1.5.11 Let $F$ be a complete ordered field.

(a) $F$ satisfies the archimedean property: For any $x > 0$ and any $y$ in $F$ there exists $n \in \mathbb{N}$ such that $nx > y$.

(b) The field $\mathbb{Q}$ of rationals is dense in $F$: whenever $x < y$ in $F$ there is $q \in \mathbb{Q}$ such that $x < q < y$.






Proof: (a) Let $A = \{nx : n \in \mathbb{N}\}$. Then $A \ne \varnothing$. If there is no $n \in \mathbb{N}$ such that $nx > y$, then $y$ is an upper bound for $A$. Thus the completeness axiom guarantees that $a_0 = \sup A$ exists (in $F$). Now $a_0 - x < a_0$, as $x > 0$, so $a_0 - x$ is not an upper bound of $A$. Hence there exists $m \in \mathbb{N}$ such that $a_0 - x < mx$. Then $a_0 < (m+1)x$. But $(m+1)x \in A$, and $a_0$ is an upper bound for $A$: contradiction.

(b) Recall that any ordered field contains (a copy of) the field $\mathbb{Q}$, by Proposition 1.5.10. We shall show that there exist $m \in \mathbb{Z}$, $n \in \mathbb{N}$ such that (with $\frac{m}{n}$ regarded as a member of $F$) $x < \frac{m}{n} < y$. Now if $x < y$, then $y - x > 0$, and so the archimedean property allows us to find a non-negative integer $n$ such that $n(y - x) > 1$. Similarly, there are non-negative integers $m_1, m_2$ such that $m_1 > nx$, $m_2 > -nx$ (just consider the two cases $x \ge 0$, $x < 0$). It follows that $-m_2 < nx < m_1$, and so there is a smallest integer $m$ such that $nx < m$. It follows that $m - 1 \le nx$, and so

$$nx < m \le 1 + nx < ny$$

which yields $x < \frac{m}{n} < y$. Division by $n$ is possible, because $n > 0$.
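The proof of part (b) is constructive, and can be turned into a small procedure. The sketch below is an illustration under assumptions, not part of the original notes: it works with exact rationals (`fractions.Fraction`) standing in for elements of $F$, and the function name `rational_between` is hypothetical. It picks $n$ with $n(y - x) > 1$, then the smallest integer $m$ with $nx < m$, and returns $m/n$.

```python
from fractions import Fraction
from math import floor

def rational_between(x, y):
    """Return a rational m/n with x < m/n < y, following the proof of (b)."""
    if not x < y:
        raise ValueError("need x < y")
    # Archimedean step: choose n with n * (y - x) > 1.
    n = floor(1 / (y - x)) + 1
    # Smallest integer m with n*x < m, so that m - 1 <= n*x.
    m = floor(n * x) + 1
    return Fraction(m, n)

x, y = Fraction(1, 3), Fraction(1, 2)
q = rational_between(x, y)
print(q, x < q < y)
```

The chain $nx < m \le 1 + nx < ny$ from the proof is exactly what guarantees that the returned value lies strictly between $x$ and $y$.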



How did we get to the completeness axiom? Our intuition about real numbers as lengths led us to believe in the existence of a number $x$ such that $x^2 = 2$. We subsequently found that such an $x$ could not be rational, and we concluded that we could not deduce the existence of $x$ from the ordered field axioms alone. We then performed an experiment in visualization, stripped it of its geometric content, and wrote down the completeness axiom.

So can we prove the existence of $\sqrt{2}$ from just the ordered field axioms and the completeness axiom? Indeed, we can:

Proposition 1.5.12 Let $F$ be a complete ordered field, let $n \in \mathbb{N}$, and let $x > 0$ in $F$. Then there exists a unique $y \in F$ such that $y > 0$ and $y^n = x$. We denote this $y$ by $y = \sqrt[n]{x}$.



Exercise 1.5.13 We prove Proposition 1.5.12.

(a) First show that there is a unique such $y$, i.e. that if $y_1, y_2 > 0$ are such that $y_1^n = x = y_2^n$, then $y_1 = y_2$.

(b) Define $A = \{t \in F : t > 0,\ t^n < x\}$. Show that $x(x+1)^{-1} \in A$.

(c) Show that $1 + x$ is an upper bound of $A$.

(d) Thus $A$ is non-empty and bounded above. Let $y = \sup A$. We shall show that $y^n = x$. To do so we shall need the following inequality: For $0 < a < b$ in $F$, we have

$$b^n - a^n < (b - a)\,n\,b^{n-1}$$

Use the identity $b^n - a^n = (b - a)(b^{n-1} + b^{n-2}a + \dots + b a^{n-2} + a^{n-1})$ to prove that this inequality holds.

(e) We apply this inequality twice: once to show that we cannot have $y^n > x$, and once to show that we cannot have $y^n < x$.






Suppose $y^n > x$. Define

$$k = \frac{y^n - x}{n\,y^{n-1}}$$

Note that $0 < k < \frac{y}{n} \le y$. Now show that if $t = y - k$ then $y^n - t^n < y^n - x$, and deduce that $t^n > x$.

(f) Thus $t$ is an upper bound of $A$. Hence explain why assuming that $y^n > x$ leads to contradiction.

(g) Next suppose that $y^n < x$. Explain why we may choose an $h \in F$ such that $0 < h < 1$ and $h < \dfrac{x - y^n}{n(y+1)^{n-1}}$.

(h) Now define $s = y + h$, and show that

$$s^n - y^n < h\,n\,s^{n-1} < h\,n\,(y+1)^{n-1} < x - y^n$$

Conclude that $s^n < x$.

(i) Explain why assuming that $y^n < x$ leads to contradiction.

(j) Conclude that $y^n = x$, as required.

□ 
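Parts (b) and (c) of Exercise 1.5.13 give an explicit element of $A$ and an explicit upper bound of $A$, which is enough to approximate $\sup A$ numerically. The following sketch is an illustration only (it is not the proof, and the function name `nth_root_bounds` is made up): it works in $\mathbb{Q}$ with exact fractions and repeatedly halves the interval between a member of $A$ and an upper bound of $A$.

```python
from fractions import Fraction

def nth_root_bounds(x, n, steps=40):
    """Bracket sup{t > 0 : t**n < x} between a member of A and an upper bound."""
    x = Fraction(x)
    lo = x / (x + 1)   # lies in A, by part (b)
    hi = 1 + x         # upper bound of A, by part (c)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid ** n < x:
            lo = mid   # mid is in A, so sup A >= mid
        else:
            hi = mid   # mid**n >= x, so mid is an upper bound of A
    return lo, hi

lo, hi = nth_root_bounds(2, 2)
print(float(lo), float(hi))
```

Each pass keeps `lo` inside $A$ and `hi` an upper bound of $A$, so $\sup A$ always lies in $[\text{lo}, \text{hi}]$. In $\mathbb{Q}$ the two ends never meet at a rational square root of 2; completeness is what supplies the limit point.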

Exercise 1.5.14 Let $F$ be an ordered field, and let $a < b$ in $F$. We have seen that sets of the form $[a,b]$, $(a,b)$, $(a,b]$, $[a,b)$ are bounded intervals.

(a) Show that not every bounded non-empty interval need be of this form.

[Hint: Consider the set $A = \mathbb{Q} \cap (0, \sqrt{2})$. This is a bounded interval in the field $\mathbb{Q}$. However, there are no $a, b$ in $\mathbb{Q}$ (!!) such that $A = (a,b)$ or $A = (a,b]$, etc.]

(b) Show that if $F$ is a complete ordered field, then every bounded non-empty interval must be of this form.

□ 

Exercise 1.5.15 Let $F$ be a complete ordered field, and let $n \in \mathbb{N}$. If $x, y > 0$ in $F$, show that $\sqrt[n]{xy} = \sqrt[n]{x}\,\sqrt[n]{y}$.

□ 

The following exercise proves an important property.

Exercise 1.5.16 (Nested Interval Property)

A family $\mathcal{A} = \{A_i : i \in I\}$ of subintervals of an ordered field is said to be nested if and only if it satisfies the following condition: Whenever $i, j \in I$, either $A_i \subseteq A_j$ or $A_j \subseteq A_i$.

(a) Prove that $\mathbb{R}$ has the nested interval property: Any nested family of closed intervals has non-empty intersection.

(b) Show that the field $\mathbb{Q}$ does not have the nested interval property.

[Hint: (a) Let $A_i$ be a closed interval with endpoints $a_i$ and $b_i$. Show that for all $i, j \in I$ we have $a_i \le b_j$. Conclude that $\{a_i : i \in I\}$ has an upper bound.

(b) Choose rational $a_i < b_i$ such that $a_i, b_i$ converge to the same irrational number.]

□ 
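To make the hint for part (b) concrete, here is one possible choice of rational endpoints, offered as a sketch rather than as the intended solution. It assumes the target irrational is $\sqrt{2}$ and uses decimal truncations; the function name `interval` is hypothetical. The code only checks nestedness and that every interval straddles $\sqrt{2}$; the argument that no rational lies in all of them is left to the exercise.

```python
from fractions import Fraction
from math import isqrt

def interval(k):
    """Rational endpoints (a_k, b_k) with a_k**2 < 2 < b_k**2 and b_k - a_k = 10**-k."""
    a = Fraction(isqrt(2 * 10 ** (2 * k)), 10 ** k)  # sqrt(2) truncated to k decimals
    return a, a + Fraction(1, 10 ** k)

intervals = [interval(k) for k in range(12)]

# Nested: each interval is contained in the previous one.
nested = all(a0 <= a1 and b1 <= b0
             for (a0, b0), (a1, b1) in zip(intervals, intervals[1:]))
# Each interval has sqrt(2) strictly between its rational endpoints.
straddles = all(a * a < 2 < b * b for a, b in intervals)
print(nested, straddles)
```

Since the lengths $10^{-k}$ shrink to zero, at most one real number can belong to every interval, and that number would have to have square 2.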



Chapter 2 

Sequences and Series 



In this chapter we start doing proper analysis. We begin our attack on the notion of limit by discussing limits of sequences.

We define $\mathbb{N}$ to be the set $\{1, 2, 3, \dots\}$ (excluding 0).

2.1 Introduction 

Intuitively, a sequence in a set $X$ is a list

$$x_1, x_2, x_3, \dots$$

of members of $X$. Now we can think of such a sequence as a function $f$ which assigns to each $n \in \mathbb{N}$ an element $x_n$ of $X$. Thus $f(1) = x_1$, $f(2) = x_2$, etc. We turn this insight into a formal definition:

Definition 2.1.1 A sequence in $\mathbb{R}$ is a function $f : \mathbb{N} \to \mathbb{R}$.

We will write $(x_n)_n$, or $(x_n : n \in \mathbb{N})$, or $(x_n)_{n=1}^{\infty}$ for the function $f$ with the property that $f(n) = x_n$ (for all $n \in \mathbb{N}$).

Remarks 2.1.2 Often we may want to consider a sequence of the form $x_0, x_1, x_2, \dots$, and sometimes even of the form $x_{-2}, x_{-1}, x_0, x_1, \dots$. Each of these can be thought of as a function. The first sequence is a function with domain $\{0, 1, 2, \dots\}$, and the second has domain $\{-2, -1, 0, 1, \dots\}$. We will write these as $(x_n)_{n=0}^{\infty}$ and $(x_n)_{n=-2}^{\infty}$ respectively.

□ 

Examples 2.1.3 Here are some examples of sequences in $\mathbb{R}$:

(a) $((-1)^n)_n$ is the sequence $-1, 1, -1, 1, \dots$, i.e. alternately $-1$ and $1$.

(b) $(\tfrac{1}{n})_n = 1, \tfrac12, \tfrac13, \dots$.

(c) We may also define a sequence inductively (or recursively), e.g. if $x_1 = x_2 = 1$ and $x_{n+2} = x_n + x_{n+1}$ for $n \in \mathbb{N}$, then

$$(x_n)_n = 1, 1, 2, 3, 5, 8, \dots$$

which is better known as the Fibonacci sequence.






□ 

Exercises 2.1.4 1. Write out the first four terms of the following sequences:

(a) $(2^n)_{n=0}^{\infty}$

(b) {xn)n=-i^ where Xn = 

2. Given a real number $x$, let $g(x)$ be the greatest integer which is $\le x$. Given that $x = 3.14159265\dots$, write out the first nine terms of the sequence defined inductively by

$$x_0 = g(x), \qquad x_n = g\Big(10^n\Big(x - \sum_{k=0}^{n-1} 10^{-k} x_k\Big)\Big) \quad\text{for } n \ge 1$$

3. Find a general formula for $x_n$ as a function of $n$, given that

(a) $x_1 = 1$, $x_{n+1} = x_n + 2$.

(b) $x_1 = 1$, $x_{n+1} = x_n + (n+1)$.

□ 
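A recursively defined sequence such as the one in Exercise 2 can be computed term by term. The sketch below is an illustration under assumptions: it reads the recursion as $x_0 = g(x)$, $x_n = g\big(10^n(x - \sum_{k=0}^{n-1} 10^{-k}x_k)\big)$, it replaces $x$ by the exact rational 3.14159265 (the printed truncation), and the function name `digit_sequence` is made up. Exact fractions are used so that the greatest-integer function is not disturbed by rounding.

```python
from fractions import Fraction
from math import floor

def digit_sequence(x, count):
    """First `count` terms x_0, x_1, ... of the recursion, computed exactly."""
    terms = []
    partial = Fraction(0)  # running value of sum_{k<n} 10**(-k) * x_k
    for n in range(count):
        term = floor(10 ** n * (x - partial))  # g = greatest integer <= argument
        terms.append(term)
        partial += Fraction(term, 10 ** n)
    return terms

x = Fraction(314159265, 10 ** 8)  # 3.14159265, as an exact rational
terms = digit_sequence(x, 9)
print(terms)
```

Comparing the printed terms with the decimal expansion of $x$ suggests what the recursion is extracting, which is taken up again in the section on decimal representations.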

You already have an intuitive understanding of the notion of convergence. For example,

$1, \tfrac12, \tfrac13, \tfrac14, \dots$ converges to $0$
$\tfrac12, \tfrac23, \tfrac34, \tfrac45, \dots$ converges to $1$
$3, 3.1, 3.14, 3.141, 3.1415, \dots$ converges to $\pi$

Intuitively, if we say that a sequence $(x_n)_n$ converges to $x$, we mean that the elements of the sequence lie closer and closer to $x$. Still, it's not quite that straightforward; we must be careful:

(i) The terms of $1, 1.2, 1.22, 1.222, \dots$ lie closer and closer to 3: The 2nd term lies closer to 3 than the 1st, and the 3rd term lies closer to 3 than the 2nd, etc. Nevertheless, the sequence does not converge to 3 (but to $\tfrac{11}{9}$).

You might say that the problem is that the terms never get to 3. True, but they don't ever quite reach $\tfrac{11}{9}$ either...

(n) The terms of the sequence 1, 1, |, 1, g, 1, j|, . . . do not lie ^''closer and closer'^ to 1: 
The term is closer to 1 than the 2'^'^, the 99*^ term is closer than the 100*^ (and also 
than the 10 billionth), etc. 

Thus some of the later terms lie further away from 1 than some of the earlier terms. 
Yet the sequence certainly converges to 1. 

It is therefore clear that, in defining the notion of convergence, we must be more precise than "$(x_n)_n$ converges to $x$ if and only if the terms $x_n$ lie closer and closer to $x$": this is too ambiguous to be useful.

The next few sections are therefore devoted to analyzing exactly what we mean when we say that a sequence converges.

Exercises 2.1.5 You need not supply formal proofs for the following exercises, but do show 
any relevant calculations. 






1. Determine whether or not the following sequences converge, and if a sequence converges, write down the limit.

(a) $x_n = \dfrac{n(n+3)}{2n(2n+2)}$

(b) $x_n = \dfrac{2^n}{n!}$

(c) $x_n = 5 + (-1)^n$

(d) $x_n = \sqrt{n+1} - \sqrt{n}$ [Hint: Show that $x_n = \dfrac{1}{\sqrt{n+1} + \sqrt{n}}$.]

(e) $x_n = \sqrt{n^2 + n} - n$

2. Can a sequence of rational numbers converge to an irrational number? Can a sequence of 
irrational numbers converge to a rational number? 

□ 



2.2 Definition of Convergence 
2.2.1 "Infinitely Often" and "Eventually" 

Here are two related ideas which will help elucidate the notion of convergence: 

Let $P$ be a property that a real number may (or may not) have. We write $P(x)$ if $x$ has the property $P$. [For example, $P$ could be the property of being positive, so that $P(1.23)$, but $\lnot P(-\pi)$. Or $Q$ could be the property of being irrational, in which case $\lnot Q(1.23)$, whereas $Q(-\pi)$.] Now suppose that $(x_n)_n$ is a sequence in $\mathbb{R}$, and that $P$ is a property:

• We say $(x_n)_n$ has property $P$ infinitely often iff there are infinitely many $n$ such that $P(x_n)$ is true.

• We say $(x_n)_n$ has property $P$ eventually iff $P(x_n)$ is true for all $n$ from some point onwards.

These are intuitive descriptions, not formal definitions — we'll come to that. But first, some 
examples: 

Examples 2.2.1 (1) The sequence 1,2,3,4,5, ... is prime infinitely often: Infinitely many 
terms are prime numbers. It is also even infinitely often. It is eventually greater than 

(2) The sequence $-2, -1, 0, 1, 2, 3, \dots$ is positive eventually: From the fourth term onwards, all terms are positive (i.e. $> 0$).

□ 

Exercises 2.2.2 (1) Let Xn = sin^. Show that {xn)n is strictly positive infinitely often, 
strictly negative infinitely often, and zero infinitely often. 

(2) Let Xn = 2 + . Show that — 2| < eventually. 

(3) Let Xn = 2 + ^^=^ for n = 1,2,3, ... . Let P{x) be the property of "being > x". For 
which X does P{x) hold (i) eventually, (ii) infinitely often? 






□ 



Remarks 2.2.3 • First note that $(x_n)_n$ has property $P$ eventually if and only if there exists an $N \in \mathbb{N}$ such that every term from the $N$-th onwards has property $P$. Formally,

$$(\exists N \in \mathbb{N})\,(\forall n \ge N)\ [x_n \text{ has property } P] \qquad (\dagger)$$

We will take this to be a formal definition (cf. Definition 2.2.4).

Thus $(x_n)_n$ has property $P$ eventually if and only if all but finitely many terms have property $P$ (i.e. at most finitely many terms have property $\lnot P$).

• Next note that if $(x_n)_n$ has property $P$ infinitely often, then the following is true:

Given any natural number $N \in \mathbb{N}$, there is a natural number $n \ge N$ such that $x_n$ satisfies property $P$, i.e.

$$(\forall N \in \mathbb{N})\,(\exists n \ge N)\ [x_n \text{ has property } P] \qquad (*)$$

For if this is not the case, then there is some $N$ such that no $n \ge N$ has property $P$. Thus if $x_n$ does have property $P$, then $n < N$.

But then there are only finitely many $x_n$ which have property $P$: at most those $x_n$ for $n = 1, 2, \dots, N-1$. This contradicts the assumption that $(x_n)_n$ has property $P$ infinitely often.

It follows that if $(x_n)_n$ has property $P$ infinitely often, then $(*)$ is true.

Conversely, if $(*)$ holds, i.e. if given any $N$ there is a later $n \ge N$ such that $x_n$ has property $P$, then there must be infinitely many elements which have property $P$:

- Take $N := 0$: There must be an $n_1 \ge 0$ such that $x_{n_1}$ has property $P$.

- Then take $N := n_1 + 1$: There must be $n_2 > n_1$ such that $x_{n_2}$ has property $P$.

- Then take $N := n_2 + 1$: there must be an $n_3 > n_2$ such that $x_{n_3}$ has property $P$, etc.

We thus obtain an infinite sequence $n_1 < n_2 < n_3 < \dots$ of natural numbers such that each $x_{n_k}$ has property $P$.

It follows that if $(*)$ is true, then $(x_n)_n$ has property $P$ infinitely often. Hence

$(*)$ is equivalent to "$(x_n)_n$ has property $P$ infinitely often"

We will therefore take $(*)$ to be a formal definition (cf. Definition 2.2.4).



• Further note that if $(x_n)_n$ has property $P$ eventually, then it also has property $P$ infinitely often.

• Finally, observe that infinitely often and eventually are closely related: If it is not the case that a sequence $(x_n)_n$ has property $P$ infinitely often, then eventually $(x_n)_n$ must have property $\lnot P$.

$$\lnot(\forall N)(\exists n \ge N)\ [x_n \text{ has property } P] \;\equiv\; (\exists N)(\forall n \ge N)\ [x_n \text{ has property } \lnot P]$$






For example, if the sequence $(x_n)_n$ is not positive infinitely often, then it has only finitely many positive terms. Thus from some point onwards, every term must be non-positive. Similarly, a sequence does not have property $P$ eventually iff it has property $\lnot P$ infinitely often:

$$\lnot(P \text{ eventually}) \equiv (\lnot P \text{ infinitely often}) \qquad \lnot(P \text{ infinitely often}) \equiv (\lnot P \text{ eventually})$$

□ 



The above remarks contain a set of formal definitions:

Definition 2.2.4 (a) We say that a sequence $(x_n)_n$ has property $P$ infinitely often if and only if

$$(\forall N \in \mathbb{N})\,(\exists n \in \mathbb{N})\ [n \ge N \,\land\, x_n \text{ has property } P]$$

(b) We say that a sequence $(x_n)_n$ has property $P$ eventually if and only if

$$(\exists N \in \mathbb{N})\,(\forall n \in \mathbb{N})\ [n \ge N \,\rightarrow\, x_n \text{ has property } P]$$



2.2.2 Convergence to 0

We first define what it means for a sequence of non-negative real numbers to converge to zero. Intuitively:

To say that $x_n \to 0$ means that $x_n$ is "small" eventually, for any measure of "smallness".

The notion "small" is subjective, so we will demand that it holds for absolutely anybody's idea of "small". Specifically, suppose you define "small" by specifying some number $\varepsilon > 0$ and saying "A non-negative number $x$ is small iff $x < \varepsilon$". To say that $(x_n)_n$ is eventually small then means that from some point onwards all the $x_n$ are small, i.e.

$$\exists N\ \forall n \ge N\ [x_n < \varepsilon]$$

This must be true no matter what gauge $\varepsilon > 0$ of "smallness" you use. Thus:

If $(x_n)_n$ is a sequence of non-negative real numbers, we say

$$x_n \to 0 \iff \forall \varepsilon > 0\ \exists N\ \forall n \ge N\ [x_n < \varepsilon]$$

We also write

$$\lim_{n \to \infty} x_n = 0$$

Thus $x_n \to 0$ iff given any $\varepsilon > 0$ it is possible to find a natural number $N$ such that

$$x_n < \varepsilon \text{ whenever } n \ge N$$

The number $N$ typically depends on $\varepsilon$. The smaller $\varepsilon > 0$, the greater $N$ usually has to be.






Example 2.2.5 We show that if $x_n := \frac{1}{n}$ then $\lim_n x_n = 0$.

Let $\varepsilon > 0$. We must show that eventually $x_n < \varepsilon$, i.e. that there is $N \in \mathbb{N}$ such that

$$x_n < \varepsilon \text{ whenever } n \ge N$$

Now note that, by the properties of an ordered field, $x_n < \varepsilon$ iff $n > \varepsilon^{-1}$. Proceed therefore as follows: By the Archimedean Property of the real numbers (which follows from the Completeness Axiom) there is an $N \in \mathbb{N}$ such that $N > \varepsilon^{-1}$. Note that this $N$ depends on $\varepsilon$: the smaller $\varepsilon$, the greater $N$ has to be, e.g. if $\varepsilon = \frac{1}{100}$ we can take $N = 101$ (or any larger integer), whereas if $\varepsilon = \frac{1}{10000}$ we must take $N$ to be at least 10 001. The point is that no matter how restrictive your definition of "small" is, there is a strategy for turning your $\varepsilon$ into an $N$: Simply take $N$ to be any integer $> \varepsilon^{-1}$.

Now if $n \ge N$, then the properties of an ordered field give

$$x_n \le x_N < \varepsilon$$

and thus we have shown that for any $\varepsilon$ there exists $N \in \mathbb{N}$ such that

$$x_n < \varepsilon \text{ whenever } n \ge N$$

which is what is required.

□ 
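The strategy of Example 2.2.5 for turning an $\varepsilon$ into an $N$ can be written out directly. The sketch below is illustrative only; the function name `N_for_reciprocal` is made up, and exact fractions are used so the comparisons are not affected by rounding. It takes $N$ to be the smallest integer greater than $\varepsilon^{-1}$ and then spot-checks $x_n < \varepsilon$ on a finite window of indices, which is of course evidence rather than proof.

```python
from fractions import Fraction
from math import floor

def N_for_reciprocal(eps):
    """Smallest integer N with N > 1/eps, for the sequence x_n = 1/n."""
    return floor(1 / Fraction(eps)) + 1

for eps in (Fraction(1, 100), Fraction(1, 10000)):
    N = N_for_reciprocal(eps)
    # Finite spot check of "x_n < eps whenever n >= N".
    ok = all(Fraction(1, n) < eps for n in range(N, N + 1000))
    print(eps, N, ok)
```

The two values of $\varepsilon$ are the ones used in the example, so the computed $N$ can be compared with the values stated there.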

Exercise 2.2.6 Let $p > 0$, and define $x_n := \frac{1}{n^p}$ for $n \in \mathbb{N}$. It is intuitively clear that $x_n \to 0$, but this needs proof... and here it is:

(a) Let $\varepsilon > 0$. Explain why there is an $N \in \mathbb{N}$ such that $N > \varepsilon^{-1/p}$.

(b) Now show that if $n \ge N$, then $\frac{1}{n^p} < \varepsilon$.

(c) Explain why you have now shown that $\lim_n \frac{1}{n^p} = 0$.

Here, as in the previous example, we have presented an algorithm for determining an $N$ from an $\varepsilon$: $N$ is any integer $> \varepsilon^{-1/p}$.

□ 

2.2.3 Formal Definition of Convergence of Sequences 

It is now rather simple to define convergence of arbitrary sequences in $\mathbb{R}$: To say that $x_n \to x$ means that the distance between $x_n$ and $x$ converges to 0, i.e.

$$|x_n - x| \to 0$$

Now the distance $|x_n - x|$ between $x_n$ and $x$ is non-negative, so we already know what $|x_n - x| \to 0$ means from the previous subsection: It means $\forall \varepsilon > 0\ \exists N\ \forall n \ge N\ [|x_n - x| < \varepsilon]$. Thus






Definition 2.2.7 If $(x_n)_n$ is a sequence of real numbers, we say

$$x_n \to x \iff \forall \varepsilon > 0\ \exists N\ \forall n \ge N\ [|x_n - x| < \varepsilon]$$

We then say that $(x_n)_n$ is a convergent sequence, with limit $x$. We also write

$$x = \lim_n x_n \quad\text{for}\quad x_n \to x$$

A sequence which is not convergent is said to be divergent.

Thus $x_n \to x$ iff given any $\varepsilon > 0$ it is possible to find a natural number $N$ such that

$$|x_n - x| < \varepsilon \text{ whenever } n \ge N$$

The number $N$ typically depends on $\varepsilon$. The smaller $\varepsilon > 0$, the greater $N$ usually has to be.

To reiterate: $x_n \to x$ iff for any $\varepsilon > 0$, all but finitely many terms lie within a distance of $\varepsilon$ of $x$.

Remarks 2.2.8 Here is an alternative (topological) way of looking at the definition of convergence: Let $U$ be an open interval containing $x$, so that $x$ is not an endpoint of $U$. Then it is easy to see that there is an $\varepsilon > 0$ so that $(x - \varepsilon, x + \varepsilon) \subseteq U$. Now note that

$$|x_n - x| < \varepsilon \iff x - \varepsilon < x_n < x + \varepsilon \iff x_n \in (x - \varepsilon, x + \varepsilon) \subseteq U$$

We therefore see that

$x_n \to x$ if and only if for any open interval $U$ containing $x$: $x_n \in U$ eventually, i.e. at most finitely many $x_n$ lie outside $U$.

This characterization of convergence mentions only open intervals, and suggests how one can define convergence in spaces more general than $\mathbb{R}$.

□ 

Again we note that the choice of $N$ will generally depend on $\varepsilon$: The smaller $\varepsilon$ is, the greater $N$ must generally be. Another example will make this clear:

Example 2.2.9 Consider the sequence given by $x_n = \frac{3n+1}{2n+3}$. By some algebraic manipulation, we see that

$$x_n = \frac{3n+1}{2n+3} = \frac{3 + \frac{1}{n}}{2 + \frac{3}{n}}$$

and since we know $\frac{1}{n} \to 0$ as $n \to \infty$, we guess that $\lim_n x_n = \frac32$. (Later, we will see that the above manipulations are perfectly acceptable. Right now, we don't know that yet, because we haven't proved it yet.)

But having guessed the answer, we can prove that it is correct, from the definition of convergence. So let us prove that $\lim_n x_n = \frac32$. We must show that, given any $\varepsilon > 0$, we can find an $N \in \mathbb{N}$ such that $|x_n - \frac32| < \varepsilon$ whenever $n \ge N$.

A little algebra shows that $|x_n - \frac32| = \frac{7}{4n+6}$. Suppose that we are given $\varepsilon = \frac{1}{10}$. Then we must find an $N$ such that $\frac{7}{4n+6} < \frac{1}{10}$ whenever $n \ge N$. Now

$$\frac{7}{4n+6} < \frac{1}{10} \quad\text{iff}\quad n > 16$$




Thus $|x_n - \frac32| < \frac{1}{10}$ whenever $n \ge 17$. (If $n > 16$, then $n \ge 17$, since $n$ is an integer.) So if we take $N = 17$, then $|x_n - \frac32| < \frac{1}{10}$ for all $n \ge N$. Of course, we can take any larger $N$ as well. If $N = 25$, then also $|x_n - \frac32| < \frac{1}{10}$ whenever $n \ge N$ (because $n \ge 25$ implies that $n \ge 17$).

Does this prove that $\lim_n x_n = \frac32$? No! We have merely shown that the requirements of the definition of limit can be fulfilled for the particular case $\varepsilon = \frac{1}{10}$. What we have to do is to show that the requirements can be fulfilled for every $\varepsilon > 0$.

Exercise: Reasoning as above, show that if $\varepsilon = \frac{1}{100}$ then we can take $N = 174$, and if $\varepsilon = \frac{1}{1000}$ we can take $N = 1749$.

We have now found $N$'s that fulfil the requirements of the definition of limit for the cases $\varepsilon = \frac{1}{10}, \frac{1}{100}, \frac{1}{1000}$, respectively. But, to prove that $\lim_n x_n = \frac32$, we must find such an $N$ for every possible $\varepsilon > 0$. There are infinitely many such $\varepsilon$, so doing this case by case is impossible: We need a general argument.

Let $\varepsilon > 0$ be arbitrary. We want to show that we can find an $N$ such that

$$\left|x_n - \tfrac32\right| = \frac{7}{4n+6} < \varepsilon \quad\text{for all } n \ge N$$

Clearly

$$\frac{7}{4n+6} < \varepsilon \quad\text{iff}\quad n > \frac{\frac{7}{\varepsilon} - 6}{4}$$

Choose $N$ to be the least integer which is greater than $\frac{\frac{7}{\varepsilon} - 6}{4}$. Such an $N$ exists by the Archimedean Property of the reals. Then $N$ fulfils the requirements of the definition of limit for $\varepsilon$. Note that $N$ depends on $\varepsilon$.

Again, we have presented an algorithm (or recipe) for obtaining an $N$ from an $\varepsilon$: Take $N$ to be any integer that is greater than $\frac{\frac{7}{\varepsilon} - 6}{4}$.

□ 
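The recipe at the end of Example 2.2.9 can be sketched in code. This is an illustration under assumptions: it takes the sequence to be $x_n = \frac{3n+1}{2n+3}$ with $|x_n - \frac32| = \frac{7}{4n+6}$, the names `x` and `N_for` are made up, and `max(1, ...)` is added only so that the result is a natural number even for large $\varepsilon$.

```python
from fractions import Fraction
from math import floor

def x(n):
    """The n-th term (3n + 1) / (2n + 3), as an exact rational."""
    return Fraction(3 * n + 1, 2 * n + 3)

def N_for(eps):
    """Least natural number N with N > (7/eps - 6) / 4."""
    bound = (7 / Fraction(eps) - 6) / 4
    return max(1, floor(bound) + 1)

for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    N = N_for(eps)
    # Finite spot check of |x_n - 3/2| < eps for n >= N.
    ok = all(abs(x(n) - Fraction(3, 2)) < eps for n in range(N, N + 500))
    print(eps, N, ok)
```

For the three values of $\varepsilon$ discussed in the example, the computed $N$ can be compared with the values found by hand.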



The following exercise will often be useful:

Exercise 2.2.10 Properties of absolute value:

(a) Triangle inequality: Show that $|x + y| \le |x| + |y|$ for all $x, y \in \mathbb{R}$. (Hint: $|x| = \sqrt{x^2}$.)

(b) Show that $\big|\,|x| - |y|\,\big| \le |x - y|$ for all $x, y \in \mathbb{R}$. (Hint: $|x| = |(x - y) + y|$. Now use the triangle inequality.)

□ 



Example 2.2.11 Suppose that we know that $x_n \to x$. We show that then also $x_n^2 \to x^2$. First note that the set $\{x_n : n \in \mathbb{N}\}$ is bounded: There is a number $K > 0$ such that $|x_n| \le K$ for all $n \in \mathbb{N}$. For we may find $N_1$ such that whenever $n \ge N_1$, we have $|x_n - x| < 1$ (definition of convergence with $\varepsilon = 1$). Thus $|x_n| < |x| + 1$ for all $n \ge N_1$ (by Exercise 2.2.10(b)). Define

$$K = \max\{|x_1|, |x_2|, \dots, |x_{N_1}|, |x| + 1\}$$

Note that if $n \in \mathbb{N}$, then $|x_n| \le K$: For if $n \le N_1$, then certainly $|x_n| \le K$. On the other hand, if $n > N_1$, then $|x_n| < |x| + 1 \le K$ as well. It follows that $\{x_n : n \in \mathbb{N}\}$ is bounded (by $K$).






We are now ready to show that $x_n^2 \to x^2$. Let $\varepsilon > 0$. Put $\varepsilon' = \frac{\varepsilon}{2K}$. Because $x_n \to x$, we can choose an $N$ such that $n \ge N$ implies $|x_n - x| < \varepsilon'$. Then, since $|x_n| \le K$ and $|x| \le K$,

$$|x_n^2 - x^2| = |x_n - x| \cdot |x_n + x| \le |x_n - x|\,(|x_n| + |x|) \le 2K\,|x_n - x| < 2K\varepsilon' = \varepsilon$$

We can thus find, for any $\varepsilon > 0$, an $N$ such that: whenever $n \ge N$, we have $|x_n^2 - x^2| < \varepsilon$.

And therefore we have shown that $x_n^2 \to x^2$ when $x_n \to x$.

The algorithm for determining $N$ from $\varepsilon$ works as follows: We are given that $x_n \to x$, and thus we have an algorithm, call it $AL_1$, for determining an $N$ from an $\varepsilon$ for the sequence $(x_n)_n$. Above, we apply $AL_1$ to $\varepsilon = 1$ to obtain $N_1$. We then use $N_1$ to find $K$. And then we apply $AL_1$ once again to $\varepsilon' = \varepsilon/2K$ to obtain the required $N$.

□ 
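The way Example 2.2.11 builds a new "$\varepsilon$ to $N$" algorithm out of an existing one can be mirrored in code. The sketch below is only an illustration: it assumes the concrete sequence $x_n = 1 + \frac{1}{n}$ with limit 1 (not taken from the notes), for which an algorithm `al1` is easy to write down, and the names `al1` and `al_square` are hypothetical.

```python
from fractions import Fraction
from math import floor

LIMIT = Fraction(1)  # assumed limit x of the sample sequence

def x(n):
    """Sample sequence x_n = 1 + 1/n (an assumption made for illustration)."""
    return 1 + Fraction(1, n)

def al1(eps):
    """AL_1: an N with |x_n - 1| = 1/n < eps for all n >= N."""
    return floor(1 / Fraction(eps)) + 1

def al_square(eps):
    """Algorithm for (x_n**2), built from AL_1 as in Example 2.2.11."""
    n1 = al1(1)                                    # apply AL_1 to eps = 1
    k = max([abs(x(n)) for n in range(1, n1 + 1)]  # K bounds every |x_n|
            + [abs(LIMIT) + 1])
    return al1(Fraction(eps) / (2 * k))            # apply AL_1 to eps / 2K

eps = Fraction(1, 100)
N = al_square(eps)
ok = all(abs(x(n) ** 2 - LIMIT ** 2) < eps for n in range(N, N + 300))
print(N, ok)
```

The resulting $N$ is usually far from the smallest one that works; the definition of convergence only asks that some $N$ exists.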



Exercise 2.2.12 Comment on the following arguments:

(a) Argument 1:

Let $x_n = (-1)^n$, let $\varepsilon = 2$, and let $N = 1$. If $n \ge N$, then $|x_n - 0| = 1 < \varepsilon$. Hence $|x_n - 0| < \varepsilon$ for all $n \ge N$, and so $\lim_n x_n = 0$.

(b) Argument 2:

Let $x_n = (-1)^n$. We show that $(x_n)_n$ does not converge. We do this by contradiction: For suppose that $x_n \to x$. Let $\varepsilon = \frac12$. Then we can find $N$ such that $n \ge N$ implies $|x_n - x| < \varepsilon$. This means that $|1 - x| < \frac12$ and $|-1 - x| = |1 + x| < \frac12$. It follows that

$$2 = |1 - (-1)| \le |1 - x| + |x - (-1)| < \tfrac12 + \tfrac12$$

i.e. $2 < 1$, a contradiction.

Hence there is no $x$ such that $x_n \to x$.



□ 



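Example 2.2.13 Let $x_n = \frac{2n^3 + 6}{n^3 - n + 1}$. We want to show that $x_n \to 2$. So let $\varepsilon > 0$ be given. We want to find an $N$ such that

$$\left|\frac{2n^3 + 6}{n^3 - n + 1} - 2\right| = \frac{2n + 4}{n^3 - n + 1} < \varepsilon \quad\text{whenever } n \ge N$$

Solving for $n$ in terms of $\varepsilon$ will involve solving a cubic, which is tricky. However, we do not need to do this. Instead we obtain a simpler upper bound for $\frac{2n+4}{n^3 - n + 1}$, a bound of the form $\frac{C}{n^2}$ for some positive constant $C$. Note that $2n + 4 \le 3n$ for all $n \ge 4$. Also note that $n^3 - n + 1 \ge \frac12 n^3$ for $n \ge 1$ (because $n^3 \ge 2n - 2$ for $n \ge 1$). Hence $\frac{2n+4}{n^3 - n + 1} \le \frac{3n}{\frac12 n^3} = \frac{6}{n^2}$ for all $n \ge 4$.

It therefore suffices to find $N \ge 4$ so that $\frac{6}{n^2} < \varepsilon$ whenever $n \ge N$ (for if $\frac{6}{n^2} < \varepsilon$, then also $\frac{2n+4}{n^3 - n + 1} < \varepsilon$, provided $n \ge 4$). Hence let $N$ be an integer which is $> \max\{4, \sqrt{6/\varepsilon}\}$.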






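Now that we have figured out how to determine an $N$ from a given $\varepsilon$, we can write down the

FORMAL PROOF: Let $\varepsilon > 0$. Choose an integer $N > \max\{4, \sqrt{6/\varepsilon}\}$. If $n \ge N$, then $n > \sqrt{6/\varepsilon}$, and so $\frac{6}{n^2} < \varepsilon$. Now since $n \ge 4$, we have $2n + 4 \le 3n$. Also $n^3 - n + 1 \ge \frac12 n^3$ for $n \ge 1$, and hence $\frac{2n+4}{n^3 - n + 1} \le \frac{3n}{\frac12 n^3} = \frac{6}{n^2} < \varepsilon$ when $n \ge N$. It follows that

$$\left|\frac{2n^3 + 6}{n^3 - n + 1} - 2\right| = \frac{2n + 4}{n^3 - n + 1} < \varepsilon \quad\text{whenever } n \ge N$$

Hence $\lim_n \frac{2n^3 + 6}{n^3 - n + 1} = 2$.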



□ 
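The choice of $N$ in Example 2.2.13 can also be spot-checked numerically. The sketch below is an illustration under assumptions: it takes $x_n = \frac{2n^3+6}{n^3-n+1}$ and the bound $\frac{6}{n^2}$ as reconstructed from the example, the names `x` and `N_for` are made up, and it finds an integer $N > \max\{4, \sqrt{6/\varepsilon}\}$ by a simple search over exact fractions instead of computing a square root.

```python
from fractions import Fraction

def x(n):
    """The n-th term (2n^3 + 6) / (n^3 - n + 1), as an exact rational."""
    return Fraction(2 * n ** 3 + 6, n ** 3 - n + 1)

def N_for(eps):
    """Smallest integer N > 4 with 6 / N**2 < eps, i.e. N > max(4, sqrt(6/eps))."""
    eps = Fraction(eps)
    N = 5
    while Fraction(6, N * N) >= eps:
        N += 1
    return N

eps = Fraction(1, 100)
N = N_for(eps)
ok = all(abs(x(n) - 2) < eps for n in range(N, N + 500))
print(N, ok)
```

As in the example, this $N$ comes from the cruder bound $\frac{6}{n^2}$, so it is generally larger than the smallest index that would work.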



Exercises 2.2.14 1. Find an N fulfilling the conditions for the given convergent sequence 

(xn) and the given e. 

(a) xn = ^, £ = 0.1. 

(b) Xn = £ = 0.001. (First prove that Xn is a decreasing sequence for n > 1 — i.e. that 
Xn+i ^ for all n > 1 — and then find N by "brute force".) 

(C) X„ = ^g±|^,£ = 10-6. 

□ 

Exercises 2.2.15 (1.) Show that if $(x_n)_n = (c, c, c, \dots)$ is a constant sequence in $\mathbb{R}$, then $x_n \to c$.

(2.) Show that if $x_n \to x$ in $\mathbb{R}$, then $|x_n| \to |x|$.

(3.) Consider the following sequence $(x_n)_n$:

$$0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, \dots, 1, \underbrace{0, 0, \dots, 0}_{n \text{ zeroes}}, 1, \underbrace{0, 0, \dots, 0}_{n+1 \text{ zeroes}}, 1, \dots$$

Does this sequence converge? Carefully explain your answer.



□ 



We also sometimes say that a sequence converges to $\pm\infty$.

To say that $x_n \to \infty$ means $(x_n)_n$ is "large" eventually.

The notion "large" is subjective, so we will demand that it holds for absolutely anybody's idea of "large". Specifically, suppose you define "large" by specifying some number $K > 0$ and saying "A number $x$ is large iff $x > K$". To say that $(x_n)_n$ is eventually large then means that from some point onwards all the $x_n$ are large, i.e.

$$\exists N\ \forall n \ge N\ [x_n > K]$$

This must be true no matter what gauge $K > 0$ of "largeness" you use. Thus:






Definition 2.2.16 If $(x_n)_n$ is a sequence of real numbers, we say

$$x_n \to \infty \iff \forall K > 0\ \exists N\ \forall n \ge N\ [x_n > K]$$

We say that $x_n \to -\infty$ iff $-x_n \to +\infty$.

Remarks 2.2.17 (a) Note that a sequence which "converges to $\infty$" is nevertheless a divergent sequence: $\lim_n x_n$ does not exist (as a real number). Nevertheless, this divergence seems not to be too badly behaved. We therefore say that $\lim_n x_n$ exists in the extended sense.

(b) We may also write $\lim_n x_n = \infty$ instead of $x_n \to \infty$, etc.

(c) Do NOT confuse $\infty$ with a real number. $\infty$ is meaningless by itself, and we haven't "created" or discovered a new number (as arguably, we have done in the case of $i = \sqrt{-1}$). To say $\lim_n x_n = \infty$ is simply a shorthand for "Given any number $K > 0$, eventually $x_n > K$". Note that infinity isn't mentioned in the definition of $x_n \to \infty$.

(d) In particular, note that the limit theorems obtained in this and the previous section do not apply to $\infty$.

□ 

Example 2.2.18 Let $x_n = \sqrt{n}$. We want to show that $x_n \to \infty$. So let $K > 0$ be given. We must find an $N$ such that $x_n > K$ whenever $n \ge N$. It is easy to see that any integer $N > K^2$ will do the trick: if $n \ge N$, then $\sqrt{n} > K$.

□ 

Exercise 2.2.19 Suppose that $(x_n)_n$ is a sequence of strictly positive reals. Prove that $\lim_n x_n = \infty$ iff $\lim_n \frac{1}{x_n} = 0$. Why do we require that the $x_n$ be strictly positive, rather than just non-zero?

[Hint: ($\Rightarrow$) Given $\varepsilon > 0$, choose $N$ such that $x_n > \frac{1}{\varepsilon}$ whenever $n \ge N$.]

□ 

We wrap up this section with two "obvious" facts: 
Proposition 2.2.20 A sequence can have at most one limit. 



Proof: Suppose that $x_n \to x$ and that $x_n \to y$, where $x \ne y$. Let $0 < \varepsilon < \frac{|x - y|}{2}$. First choose $N_x$ such that $|x_n - x| < \varepsilon$ whenever $n \ge N_x$. Then choose $N_y$ such that $|x_n - y| < \varepsilon$ whenever $n \ge N_y$. Now let $N = \max\{N_x, N_y\}$. Thus:

If $n \ge N$, then both $|x_n - x| < \varepsilon$ and $|x_n - y| < \varepsilon$.

Hence

$$|x - y| \le |x - x_n| + |x_n - y| < 2\varepsilon < |x - y|$$

Thus, assuming that a sequence has two distinct limits $x, y$, we have concluded that $|x - y| < |x - y|$, a contradiction.






Definition 2.2.21 A subset $A \subseteq \mathbb{R}$ is bounded if it is contained in a finite interval. Equivalently, $A$ is bounded if and only if there is a number $K$ such that $|a| \le K$ for all $a \in A$ (in which case $A \subseteq [-K, K]$).

Note that Example 2.2.11 contains a useful proposition:



Proposition 2.2.22 Any convergent sequence is bounded. 



Proof: Suppose that $x_n \to x$. Choose $N_1$ such that whenever $n \ge N_1$, we have $|x_n - x| < 1$ (definition of convergence with $\varepsilon = 1$). Thus $|x_n| < |x| + 1$ for all $n \ge N_1$ (by Exercise

Now define

$$K = \max\{|x_1|, |x_2|, \dots, |x_{N_1}|, |x| + 1\}$$

Note that if $n \in \mathbb{N}$, then $|x_n| \le K$: For if $n \le N_1$, then certainly $|x_n| \le K$. On the other hand, if $n > N_1$, then $|x_n| < |x| + 1 \le K$ as well. It follows that $\{x_n : n \in \mathbb{N}\}$ is bounded by $K$.



Exercise 2.2.23 Here are some variations on the definition of limit. These are not serious definitions, but merely intended to give you a feel for the role of the quantifiers in the definition of limit.

(i) Define a sequence $(x_n)_n$ of real numbers to be W-convergent to $x$ if and only if $\forall N \in \mathbb{N}\ \exists \varepsilon > 0\ \forall n \ge N\ [|x_n - x| < \varepsilon]$.

(ii) Define a sequence $(x_n)_n$ of real numbers to be S-convergent to $x$ if and only if $\exists N \in \mathbb{N}\ \forall \varepsilon > 0\ \forall n \ge N\ [|x_n - x| < \varepsilon]$.

Now do the following:

(a) Describe all the W-convergent sequences, and show that every convergent sequence is W-convergent. Give an example of a W-convergent sequence which is not S-convergent.

(b) Describe all the S-convergent sequences, and show that every S-convergent sequence is convergent. Give an example of a convergent sequence which is not S-convergent.

(c) Show that W-limits may not be unique, but that S-limits are unique. 

□ 

2.3 Arithmetic, Order and Convergence 
2.3.1 Arithmetic and Convergence 

It can be quite difficult to prove that a sequence $(x_n)_n$ converges to a particular limit $x$ if we use only the definition of convergence (i.e. Definition 2.2.7). For example, to show that $\frac{2n^3 + 6}{n^3 - n + 1} \to 2$ directly from the definition was quite tricky (cf. Example 2.2.13).

Yet it is easy to "see" that $\frac{2n^3 + 6}{n^3 - n + 1} \to 2$:

$$\frac{2n^3 + 6}{n^3 - n + 1} = \frac{2 + \frac{6}{n^3}}{1 - \frac{1}{n^2} + \frac{1}{n^3}}$$

Now $\frac{1}{n^3} \to 0$, so the numerator $\left(2 + \frac{6}{n^3}\right) \to 2$. Similarly, $1 - \frac{1}{n^2} + \frac{1}{n^3} \to 1$. Thus $\frac{2n^3 + 6}{n^3 - n + 1} \to \frac{2}{1} = 2$.

If we analyze the above "proof", we see that we have made the following assumptions:






(i) If $x_n \to x$ and $y_n \to y$, then $(x_n + y_n) \to (x + y)$. Thus $\lim_n (x_n + y_n) = \lim_n x_n + \lim_n y_n$, i.e.

The limit of a sum is the sum of the limits

It doesn't matter if we first add the $x$'s to the $y$'s and then take the limit, or if we first take the limits of the $x$'s and the $y$'s, and then add those. Addition commutes with limit.

(ii) If $x_n \to x$ and $y_n \to y$, then $\frac{x_n}{y_n} \to \frac{x}{y}$ (assuming $y \ne 0$). Thus $\lim_n \frac{x_n}{y_n} = \frac{\lim_n x_n}{\lim_n y_n}$, and division commutes with limit.

Of course, this needs proof. 



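Theorem 2.3.1 Let $(x_n)_n$, $(y_n)_n$ be sequences in $\mathbb{R}$, with $x_n \to x$, $y_n \to y$. Also let $a \in \mathbb{R}$. Then

(a) $(x_n + y_n) \to x + y$;

(b) $a x_n \to a x$;

(c) $x_n y_n \to x y$;

(d) $\frac{1}{x_n} \to \frac{1}{x}$ (provided $x_n \ne 0$ for $n \in \mathbb{N}$, and $x \ne 0$);

(e) $\frac{x_n}{y_n} \to \frac{x}{y}$ (provided $y_n \ne 0$ for $n \in \mathbb{N}$, and $y \ne 0$).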



To help you understand what exactly must be accomplished, we'll leave the first two as 
exercises. Do this exercise NOW! 

Exercise 2.3.2 (a) Suppose that $x_n \to x$, $y_n \to y$ in $\mathbb{R}$. We will show that $(x_n + y_n) \to x + y$.

(i) We must show that, given any $\varepsilon > 0$ there is $N \in \mathbb{N}$ such that $|(x_n + y_n) - (x + y)| < \varepsilon$ whenever $n \ge N$. Make sure you understand this.

(ii) Explain why we can find $N_1$ such that $|x_n - x| < \frac{\varepsilon}{2}$ whenever $n \ge N_1$.

(iii) Similarly, we can find an $N_2$ such that the same holds for $y_n$.

(iv) Let $N = \max\{N_1, N_2\}$. Use the triangle inequality to deduce that $|(x_n + y_n) - (x + y)| < \varepsilon$ when $n \ge N$.

(v) Conclude that $\lim_n (x_n + y_n) = x + y$, as required.

(b) Let $x_n \to x$ in $\mathbb{R}$, and suppose that $a \in \mathbb{R}$. Show that $a x_n \to a x$.

[Hint: If $a \ne 0$, choose $N \in \mathbb{N}$ such that $|x_n - x| < \frac{\varepsilon}{|a|}$ whenever $n \ge N$. Also, don't forget the case $a = 0$.]

□ 



Proof of Theorem 2.3.1: I hope you did Exercise 2.3.2.

We need only prove (c)-(e), as (a), (b) were dealt with there. So suppose that $x_n \to x$, $y_n \to y$ in $\mathbb{R}$, and let $\varepsilon > 0$. Note that

$$x_n y_n - x y = x_n (y_n - y) + y (x_n - x)$$






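Now because the sequence $(x_n)_n$ converges, it is bounded (by Proposition 2.2.22), and thus there is a $K_1 > 0$ such that $|x_n| \le K_1$ for all $n \in \mathbb{N}$, and also $|x| \le K_1$. Similarly, there is $K_2 > 0$ such that $|y_n| \le K_2$ for all $n \in \mathbb{N}$, and also $|y| \le K_2$.

Now choose $N_1, N_2$ such that

$$n \ge N_1 \Rightarrow |x_n - x| < \frac{\varepsilon}{2K_2}$$
$$n \ge N_2 \Rightarrow |y_n - y| < \frac{\varepsilon}{2K_1}$$

Put $N = \max\{N_1, N_2\}$. If $n \ge N$, we have

$$|x_n (y_n - y)| = |x_n| \cdot |y_n - y| < K_1 \cdot \frac{\varepsilon}{2K_1} = \frac{\varepsilon}{2}$$

Similarly, $n \ge N$ implies $|y (x_n - x)| < \frac{\varepsilon}{2}$. Thus

$$n \ge N \Rightarrow |x_n y_n - x y| \le |x_n (y_n - y)| + |y (x_n - x)| < \varepsilon$$

and this proves (c).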

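To prove (d), suppose that $(x_n)_n$ is a sequence of non-zero real numbers which converges to $x$, where also $x \ne 0$. We first show that the sequence $\left(\frac{1}{x_n}\right)_n$ is bounded, i.e. that there is a $K$ such that, for all $n \in \mathbb{N}$, we have $\left|\frac{1}{x_n}\right| \le K$. For, since $|x| \ne 0$, we can choose $K_1 > 0$ such that $\frac{1}{K_1} < |x|$ (by the Archimedean Property). Now since $x_n \to x$, we also have $|x_n| \to |x|$ (by Exercise 2.2.15), and thus $|x_n| > \frac{1}{K_1}$ eventually. Formally, there is $N$ such that $n \ge N$ implies $\frac{1}{K_1} < |x_n|$. Now define

$$K = \max\left\{\frac{1}{|x_1|}, \dots, \frac{1}{|x_N|}, K_1\right\}$$

Then $\left|\frac{1}{x_n}\right| \le K$ for all $n \in \mathbb{N}$, proving that $\left(\frac{1}{x_n}\right)_n$ is bounded (by $K$).

Next, note that

$$\left|\frac{1}{x_n} - \frac{1}{x}\right| = \frac{|x - x_n|}{|x_n x|}$$

Given now an $\varepsilon > 0$, we can find an $N$ such that

$$n \ge N \Rightarrow |x_n - x| < \frac{\varepsilon}{K^2}$$

because $x_n \to x$. It follows that

$$n \ge N \Rightarrow \left|\frac{1}{x_n} - \frac{1}{x}\right| = \frac{|x - x_n|}{|x_n x|} < \varepsilon$$

as $\frac{1}{|x_n x|} \le K^2$ (note that $\frac{1}{|x|} < K_1 \le K$). Thus $\frac{1}{x_n} \to \frac{1}{x}$, as required. This proves (d).

Finally, (e) is an easy consequence of (c) and (d): Just notice that $\frac{x_n}{y_n} = x_n \cdot \frac{1}{y_n}$.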



Exercise 2.3.3 Suppose that $(y_n)_n$ is a non-negative sequence in $\mathbb{R}$ which converges to $y$. We show that then also $\sqrt{y_n} \to \sqrt{y}$.

(a) We consider, separately, two cases: $y > 0$ and $y = 0$. First assume that $y = 0$, i.e. that $y_n \to 0$. Prove that in that case also $\sqrt{y_n} \to 0$.

(b) Next assume that $y > 0$. Explain why there is a $K > 0$ such that $y \ge K$ and such that $y_n \ge K$ eventually.

(c) Now notice that $(\sqrt{y_n} - \sqrt{y})(\sqrt{y_n} + \sqrt{y}) = (y_n - y)$, and that $(\sqrt{y_n} + \sqrt{y}) \ge 2\sqrt{K}$ eventually. Use these facts to show that $\sqrt{y_n} \to \sqrt{y}$ in this case also.
[Hint: Choose $N$ such that $|y_n - y| < 2\sqrt{K}\,\varepsilon$ whenever $n \ge N$.]

□ 

Exercise 2.3.4 Let $(x_n)_n$ be an arbitrary sequence of real numbers. Construct a new sequence $(y_n)_n$ of Cesàro means by defining

$$y_n := \frac{x_1 + x_2 + \dots + x_n}{n}$$

(a) Show that if $(x_n)_n$ converges, then so does $(y_n)_n$, and to the same limit.

[Hint: Suppose $x_n \to x$. First choose $M$ such that $|x_n - x| < \frac{\varepsilon}{2}$ whenever $n \ge M$. Then choose $N \ge M$ such that $\frac{|x_k - x|}{N} < \frac{\varepsilon}{2M}$ for all $k \le M$. Then show

$$|y_n - x| \le \left(\frac{|x_1 - x|}{n} + \dots + \frac{|x_M - x|}{n}\right) + \left(\frac{|x_{M+1} - x|}{n} + \dots + \frac{|x_n - x|}{n}\right)$$

and let $n \ge N$.]

(b) Prove or Disprove: If $(y_n)_n$ converges, then so does $(x_n)_n$, and to the same limit.

2.3.2 Order, Completeness and Convergence 

Note that 

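$$x_n \to x \quad \text{iff} \quad \forall \varepsilon > 0\ \exists N \in \mathbb{N}\ \forall n \ge N\ [x - \varepsilon < x_n < x + \varepsilon]$$

This is because $|y - x| < \varepsilon \iff x - \varepsilon < y < x + \varepsilon$. Thus to say that $x_n \to x$ means that, for any $\varepsilon > 0$, eventually the sequence lies in the open interval $(x - \varepsilon, x + \varepsilon)$.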
The next theorem is left as an exercise: 



Theorem 2.3.5 (Sandwich Theorem, or Squeeze Theorem)

Suppose that $(x_n)_n$, $(y_n)_n$ and $(z_n)_n$ are sequences in $\mathbb{R}$ which satisfy the following conditions:

(i) $x_n \le y_n \le z_n$ for all $n \in \mathbb{N}$ (or merely eventually);
(ii) There is $l \in \mathbb{R}$ such that $x_n \to l$ and $z_n \to l$.

Then also $y_n \to l$.

Exercise 2.3.6 The aim of this exercise is to prove the Sandwich Theorem. So let $\varepsilon > 0$. We must show that there is $N \in \mathbb{N}$ such that $|y_n - l| < \varepsilon$ whenever $n \ge N$, or equivalently, that $l - \varepsilon < y_n < l + \varepsilon$ whenever $n \ge N$.






(a) Assume first that $x_n \le y_n \le z_n$ for all $n \in \mathbb{N}$. Explain why there is $N_1 \in \mathbb{N}$ such that whenever $n \ge N_1$, we have $l - \varepsilon < x_n < l + \varepsilon$.

(b) Now explain why there is an $N \in \mathbb{N}$ such that whenever $n \ge N$, we have both $l - \varepsilon < x_n < l + \varepsilon$ and $l - \varepsilon < z_n < l + \varepsilon$.

(c) Now explain why also $l - \varepsilon < y_n < l + \varepsilon$ whenever $n \ge N$.

(d) The Theorem has now been proved for the case where $x_n \le y_n \le z_n$ for all $n \in \mathbb{N}$. Modify your proof slightly to show that the Theorem remains true if we have $x_n \le y_n \le z_n$ eventually.

□ 

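Example 2.3.7 We use the Sandwich Theorem to show that if $|x| < 1$, then $x^n \to 0$ (as $n \to \infty$). This is obvious if $x = 0$. Now if $0 < |x| < 1$, then $\frac{1}{|x|} > 1$, i.e. $\frac{1}{|x|} = 1 + h$ for some positive $h$. Thus $|x|^n = \frac{1}{(1 + h)^n}$. Now by the Binomial Theorem,

$$(1 + h)^n = \binom{n}{0} + \binom{n}{1} h + \binom{n}{2} h^2 + \dots + \binom{n}{n} h^n > nh$$

Hence $0 < |x|^n < \frac{1}{nh}$. Now we proved earlier that $\frac{1}{n} \to 0$, and thus $\frac{1}{nh} \to 0$ as well. Put $a_n = 0$, $b_n = \frac{1}{nh}$ for all $n$. Then

$$a_n \to 0, \quad b_n \to 0 \quad \text{and} \quad a_n \le |x|^n \le b_n$$

By the Sandwich Theorem, also $|x|^n \to 0$. It follows easily that $x^n \to 0$ as well.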

□ 
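The sandwich $0 < |x|^n < \frac{1}{nh}$ from Example 2.3.7 can be spot-checked with exact rational arithmetic. This is only an illustrative sketch: the helper name `bound_holds`, the cut-off `n_max` and the sample values of $x$ are our own assumptions.

```python
from fractions import Fraction

def bound_holds(x, n_max=200):
    """Check 0 < |x|^n < 1/(n*h) for n = 1..n_max, where 1/|x| = 1 + h."""
    h = 1 / abs(x) - 1          # h > 0 because 0 < |x| < 1
    return all(0 < abs(x)**n < 1 / (n * h) for n in range(1, n_max + 1))

# Two sample values with 0 < |x| < 1 (exact rationals, so no rounding error).
print(bound_holds(Fraction(-1, 2)), bound_holds(Fraction(9, 10)))
```

The upper bound $\frac{1}{nh}$ is very crude (it decays like $\frac{1}{n}$, whereas $|x|^n$ decays geometrically), but a crude bound that tends to $0$ is all the Sandwich Theorem needs.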

Exercise 2.3.8 Show that $\sqrt[n]{n} \to 1$ as $n \to \infty$.

[Hint: Let $x_n = \sqrt[n]{n} - 1$, and use the binomial theorem to prove that $n \ge \frac{n(n-1)}{2} x_n^2$. Then use the Sandwich Theorem to show that $x_n \to 0$.]

□ 

Next, we consider monotone sequences: 
Definition 2.3.9 Let $(x_n)_n$ be a sequence in $\mathbb{R}$.

(a) $(x_n)_n$ is said to be an increasing sequence if and only if $n \le m$ implies $x_n \le x_m$, i.e. iff

$$x_1 \le x_2 \le x_3 \le \dots \le x_n \le \dots$$

(b) $(x_n)_n$ is said to be a strictly increasing sequence if and only if $n < m$ implies $x_n < x_m$, i.e. iff

$$x_1 < x_2 < x_3 < \dots < x_n < \dots$$

(c) Decreasing and strictly decreasing sequences are defined by replacing $\le$ with $\ge$, and $<$ with $>$.

(d) A monotone sequence is one that is either increasing or decreasing. A strictly monotone 
sequence is one that is either strictly increasing or strictly decreasing. 

Warning: In the literature, what we call an increasing sequence is often called a non-decreasing sequence, 
and what we call a strictly increasing sequence is often called an increasing sequence. 






Examples 2.3.10 (a) 1, 2, 3, ... , is strictly increasing, and thus increasing. 

(b) Any constant sequence is both increasing and decreasing. However, it is neither strictly 
increasing, nor strictly decreasing. 

(c) The sequence 1,2,1,2,1,2,... is neither increasing nor decreasing. 

□ 

The following fundamental — and surprisingly useful — fact is just the Completeness Axiom, 
couched in sequence terminology. 

Theorem 2.3.11 Let $(x_n)_n$ be a monotone sequence in $\mathbb{R}$. Then $(x_n)_n$ converges if and only if it is bounded.

Moreover, if $(x_n)_n$ is a bounded increasing sequence, it converges to $\sup_n x_n$, whereas if $(x_n)_n$ is a bounded decreasing sequence, it converges to $\inf_n x_n$.

Proof: ($\Rightarrow$) If $(x_n)_n$ is any convergent sequence (monotone or not), then it is bounded, by Proposition 2.2.22.

($\Leftarrow$) Suppose that $(x_n)_n$ is an increasing bounded sequence. By the Completeness Axiom, there is a number

$$x^* = \sup_n x_n = \sup\{x_n : n \in \mathbb{N}\}$$

We will show that $x_n \to x^*$. So let $\varepsilon > 0$. Then $x^* - \varepsilon$ is not an upper bound of the set $\{x_n : n \in \mathbb{N}\}$ (because $x^* - \varepsilon$ is strictly less than the least upper bound $x^*$). It follows that there is $n_0 \in \mathbb{N}$ such that $x_{n_0} > x^* - \varepsilon$.

Now if $n \ge n_0$ we have

(i) $x_n \ge x_{n_0}$, because $(x_n)_n$ is increasing.

(ii) $x_n \le x^*$, because $x^*$ is an upper bound of the sequence elements.

Hence $x^* - \varepsilon < x_{n_0} \le x_n \le x^*$, and so $|x_n - x^*| < \varepsilon$ whenever $n \ge n_0$.

Since we can find such an $n_0$ for every $\varepsilon > 0$, we have shown that

$$\forall \varepsilon > 0\ \exists n_0 \in \mathbb{N}\ \forall n \ge n_0\ [|x_n - x^*| < \varepsilon]$$

which is the same as saying $x_n \to x^*$.

We have therefore shown that any bounded increasing sequence converges. Suppose now that $(x_n)_n$ is a bounded decreasing sequence. Then $(-x_n)_n$ is a bounded increasing sequence, and thus converges. It is now easy to see that $(x_n)_n$ converges as well.

H 

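Remarks 2.3.12 If $(x_n)_n$ is an increasing sequence which converges to $x$, we often write

$$x_n \uparrow x \quad \text{or} \quad x = \uparrow \lim_n x_n$$

instead of $x_n \to x$. Similarly, if $(x_n)_n$ is a decreasing convergent sequence, we write

$$x_n \downarrow x \quad \text{or} \quad x = \downarrow \lim_n x_n$$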






□ 

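Note that the Completeness Axiom, via the preceding theorem, guarantees that a bounded monotone sequence has a limit, even if we have no way of directly determining what that limit is. This is the case in the following example:

Example 2.3.13 Define $x_n := \left(1 + \frac{1}{n}\right)^n$. We show that $(x_n)_n$ converges. By the preceding theorem, it suffices to show (i) that $(x_n)_n$ is increasing, and (ii) that it is bounded. Now by the Binomial Theorem

$$x_n = 1 + \frac{n}{1!} \cdot \frac{1}{n} + \frac{n(n-1)}{2!} \cdot \frac{1}{n^2} + \dots + \frac{n(n-1) \cdots 2 \cdot 1}{n!} \cdot \frac{1}{n^n}$$

$$= 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \dots + \frac{1}{n!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{n-1}{n}\right)$$

Similarly,

$$x_{n+1} = 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n+1}\right) + \dots + \frac{1}{(n+1)!}\left(1 - \frac{1}{n+1}\right)\left(1 - \frac{2}{n+1}\right) \cdots \left(1 - \frac{n}{n+1}\right)$$

Now we compare terms. $x_n$ has $n + 1$ terms, whereas $x_{n+1}$ has $n + 2$ terms, all non-negative. $x_n$ and $x_{n+1}$ agree on the first two terms. Now if $2 < k \le n + 1$, then the $k^{\text{th}}$ term of $x_n$ is

$$\frac{1}{(k-1)!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-2}{n}\right)$$

whereas the $k^{\text{th}}$ term of $x_{n+1}$ is

$$\frac{1}{(k-1)!}\left(1 - \frac{1}{n+1}\right)\left(1 - \frac{2}{n+1}\right) \cdots \left(1 - \frac{k-2}{n+1}\right)$$

It is therefore clear that the $k^{\text{th}}$ term of $x_n$ is less than the $k^{\text{th}}$ term of $x_{n+1}$. Moreover, $x_{n+1}$ has one more term, which is strictly positive. It follows that $x_n < x_{n+1}$. Thus $(x_n)_n$ is an increasing sequence.

Next, we show that $(x_n)_n$ is bounded. Look again at the $k^{\text{th}}$ term of $x_n$ (for $2 < k \le n + 1$): We have

$$\frac{1}{(k-1)!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-2}{n}\right) \le \frac{1}{2^{k-2}}$$

This is (i), because

$$2^{k-2} = 2 \cdot 2 \cdots 2 \le 2 \cdot 3 \cdots (k-1) = (k-1)!$$

and (ii), because

$$\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-2}{n}\right) \le 1 \cdot 1 \cdots 1 = 1$$

It follows that

$$x_n \le 1 + 1 + \frac{1}{2} + \frac{1}{4} + \dots + \frac{1}{2^{n-1}} < 3$$

and thus that $x_n < 3$ for all $n$. Hence $(x_n)_n$ is bounded.

We can now conclude that $(x_n)_n$ converges, though we do not yet know precisely where it converges to.

If you stick the $x_n$ into a calculator, you will see that $x_n \to e$, where $e = 2.7182818\ldots$ is the base of the natural logarithm.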






□ 
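Instead of a calculator, one can tabulate the first few terms of $x_n = (1 + \frac{1}{n})^n$ exactly. The sketch below is only an illustration of the two properties proved in Example 2.3.13; the cut-off of 50 terms and the names `x`, `terms`, `increasing`, `bounded` are our own choices, and a finite check is of course no substitute for the proof.

```python
from fractions import Fraction

def x(n):
    """Exact value of x_n = (1 + 1/n)^n as a rational number."""
    return (1 + Fraction(1, n)) ** n

terms = [x(n) for n in range(1, 51)]

# (i) strictly increasing and (ii) bounded above by 3, on the terms computed.
increasing = all(a < b for a, b in zip(terms, terms[1:]))
bounded = all(t < 3 for t in terms)

print(increasing, bounded, float(terms[-1]))   # last value is roughly 2.69
```

The convergence to $e$ is slow: even $x_{50}$ agrees with $e = 2.718\ldots$ only in the first digit after the decimal point.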

Exercise 2.3.14 Consider the following inductively presented sequence

$$x_1 = 1 \qquad x_{n+1} = \sqrt{x_n + 1}$$

(a) Show that $x_n < 2$ for all $n \in \mathbb{N}$.

(b) Show that $(x_n)_n$ is increasing.

(c) Conclude that $x = \lim_n x_n$ exists.

(d) Prove that $\lim_n \sqrt{x_n + 1} = \sqrt{x + 1}$.

(e) Conclude that $\lim_n x_n = \frac{1 + \sqrt{5}}{2}$, the Golden Ratio.

[Hints: (a) Use mathematical induction. (b) Induction again: Assuming $x_n \le x_{n+1}$, show that $x_{n+1} \le x_{n+2}$. (d) Exercise 2.3.3.]



□ 

Remarks 2.3.15 In the previous exercise, we were given an inductively defined sequence in the form

$$x_1 = c \qquad x_{n+1} = f(x_n)$$

where $f$ is a continuous function. In that case, if the limit $x = \lim_n x_n$ exists, then it is a fixed point of $f$: it satisfies the equation

$$f(x) = x$$

This is because

$$x = \lim_n x_{n+1} = \lim_n f(x_n) = f(\lim_n x_n) = f(x)$$

Interchanging the order of function $f$ and limit is permitted because $f$ is continuous, something which we will discuss in more detail in a later chapter.

Beware, however: Blindly applying the above reasoning to the sequence

$$x_1 = 1 \qquad x_{n+1} = -x_n$$

yields $-x = x$ (where $x = \lim_n x_n$), and so $x = 0$. But, of course, the sequence $(x_n)_n$ is just $1, -1, 1, -1, \dots$, which does not converge. Therefore, before you can apply the above method to find the limit, you must be sure that the given sequence actually has a limit.

□ 
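Both situations described in Remarks 2.3.15 are easy to observe numerically. The following sketch uses floating-point arithmetic, so it only suggests (and does not prove) convergence; the helper `iterate` and the number of steps are our own illustrative assumptions.

```python
import math

def iterate(f, x1, steps):
    """Return the first `steps` terms of x_1, x_2 = f(x_1), x_3 = f(x_2), ..."""
    xs = [x1]
    for _ in range(steps - 1):
        xs.append(f(xs[-1]))
    return xs

golden = (1 + math.sqrt(5)) / 2

# x_{n+1} = sqrt(x_n + 1): the terms approach the fixed point of f.
xs = iterate(lambda t: math.sqrt(t + 1), 1.0, 40)
print(abs(xs[-1] - golden) < 1e-12)

# x_{n+1} = -x_n: f has the fixed point 0, but the sequence never approaches it.
flip = iterate(lambda t: -t, 1.0, 6)
print(flip)
```

In the second case the equation $f(x) = x$ still has a solution, namely $x = 0$; what fails is the existence of the limit, which is exactly the point of the warning above.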

Exercise 2.3.16 Show that the following sequences converge, and hence find their limits: 

(a) Xi = 1, Xn+l = :^^Xn 

(b) Xi = 1, Xn+l = (x„ + l)/3. 



□ 






2.4 Representation of Real Numbers by Decimals 

Exercise 2.4.1 We are very used to representing a real number by a decimal expansion, e.g. $\pi = 3.14159265\ldots$. When we investigated the structure of the reals, however, we decided that the reals are a complete ordered field; decimals weren't mentioned once. Right now, in our development of the reals, we do not yet know what we mean by an expression like $3.14159265\ldots$. So here are two questions:

(I) What, exactly, do we mean by an expression $3.14159265\ldots$? There are infinitely many numbers in this expansion. Can we give this expression a precise meaning?

(II) Can every real number be represented by a decimal expansion?

We restrict ourselves to numbers in the unit interval $[0, 1)$. Recall that $[x]$ is defined to be the greatest integer which is $\le x$, e.g. $[\pi] = 3$, $[\sqrt{2}] = [1.9] = [1] = 1$.

(a) Consider the expression $0.a_1 a_2 a_3 a_4 \ldots$, where the $a_n$ are integers with $0 \le a_n \le 9$. We'd like to assign a meaning to this expression. Define $s_n = \frac{a_1}{10} + \frac{a_2}{10^2} + \dots + \frac{a_n}{10^n}$. Show that the sequence $(s_n)_n$ converges.

(b) Spend some time convincing yourself that it would be a good idea to define

$$0.a_1 a_2 a_3 a_4 \ldots = \lim_n s_n$$

(or else, if you disagree, find a different method which assigns a meaning to the expression $0.a_1 a_2 a_3 a_4 \ldots$).

(c) Moving on to Question (II), we want to show that every real number has a decimal expansion. So let $s \in [0, 1)$. Define two sequences $(x_n)_n$ and $(a_n)_n$ inductively, as follows: Put $x_1 = s$. Assuming that $x_1, \dots, x_n$ and $a_1, \dots, a_{n-1}$ have been defined, put

$$a_n = [10 x_n] \qquad x_{n+1} = 10 x_n - a_n$$

Find the first 5 terms of each sequence if s = ^ . 

(d) Explain why $0 \le x_n < 1$ for each $n$.

(e) Hence show that each $a_n$ is an integer, and that $0 \le a_n \le 9$.

(f) Show (by induction) that $x_{n+1} = 10^n s - (10^{n-1} a_1 + 10^{n-2} a_2 + \dots + a_n)$.

(g) As above, define $s_n = \sum_{k=1}^n \frac{a_k}{10^k}$. Conclude that $0 \le s - s_n < 10^{-n}$.

(h) Hence show that $s_n \to s$, and thus that $s = 0.a_1 a_2 a_3 a_4 \ldots$.

□ 
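The inductive construction $a_n = [10 x_n]$, $x_{n+1} = 10 x_n - a_n$ of Exercise 2.4.1 translates directly into a short program. The sketch below uses exact rationals; the function name `decimal_digits` and the sample value $s = \frac{1}{7}$ are our own illustrative assumptions and are not taken from the exercise.

```python
from fractions import Fraction
from math import floor

def decimal_digits(s, count):
    """First `count` digits a_1, a_2, ... of s in [0, 1), via a_n = [10 x_n]."""
    x = Fraction(s)              # x_1 = s
    digits = []
    for _ in range(count):
        a = floor(10 * x)        # a_n = [10 x_n], the greatest integer <= 10 x_n
        digits.append(a)
        x = 10 * x - a           # x_{n+1} = 10 x_n - a_n, again in [0, 1)
    return digits

s = Fraction(1, 7)               # illustrative choice of s
digits = decimal_digits(s, 6)
# Partial sum s_n = a_1/10 + a_2/10^2 + ... + a_n/10^n, here with n = 6.
s_n = sum(Fraction(a, 10**k) for k, a in enumerate(digits, start=1))

print(digits, 0 <= s - s_n < Fraction(1, 10**6))
```

The final comparison is an instance of part (g): the partial sum $s_n$ underestimates $s$ by less than $10^{-n}$.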

So we have now shown that every real number has a decimal representation. Of course, we have not shown that such a representation is unique. Indeed it isn't. For example, $0.25 = 0.24999\ldots$. But it is only terminating decimals¹ that have another representation. If you think carefully, you will conclude that all reals have a unique non-terminating decimal representation.

¹ i.e. decimals that end, eventually, in all zeroes.






2.5 Introduction to Series 

In this section, we are concerned with numerical expressions of the form

$$x_1 + x_2 + x_3 + \dots + x_n + \dots$$

which we may also write as

$$\sum_{k=1}^{\infty} x_k$$

Such an expression is called an infinite series, or just a series.
2.5.1 The Paradoxes of Zeno 

The Greek philosopher Zeno (of Elea, ca. 450 BC) is responsible for several delightful paradoxes. The most well-known paradox involves the swift-footed warrior Achilles, whose rage is the central motive of Homer's epic The Iliad, and a Tortoise who studied under Socrates. Briefly, the story has the following plot:

The Tortoise challenges Achilles to a race, claiming that he would win over any distance, provided that Achilles give him a small head start. Achilles agrees, with alacrity, and a date is set. Just before the race begins, Achilles and the Tortoise engage in some idle chit-chat:

TORTOISE: You may as well concede the race now, oh Achilles. It is logically impossible for you to win.

ACHILLES: . . . (Though the son of a Goddess, Achilles doubts whether the Gods themselves are immune from the constraints of logic. . . )

TORTOISE: Let me demonstrate. You agree that you will run the distance of my head start in quite a short time?

ACHILLES: Indeed, a very short time. After I have run to your starting point, I will be practically upon you: you will be ahead only very slightly.

TORTOISE: But you agree that I will be ahead. . . Nevertheless, you will cover the remaining distance between us quite quickly?

ACHILLES: Indeed, extremely quickly. When I cover that distance you'll have inched forward hardly any distance at all.

TORTOISE: But I will have inched forward some distance, and will therefore be ever so slightly ahead. . . Now you will presumably cover the new distance between us in very little time?

ACHILLES: As you say, in hardly any time at all. You will barely have moved. . . Oh. . .

TORTOISE: I see you have spotted the problem, Achilles. No matter how fast you move, when you reach the spot where you saw me last, I will have moved ahead, be it ever so slightly. Thus you will never catch up with me.

ACHILLES: It is as you say, oh Tortoise. . .

says Achilles, and proceeds to win the race with ease.²

Let us examine the Tortoise's argument. We know, from experience, that the Tortoise 
is wrong. We also know, from experience, that logic never lets us down, only the faulty 
application of it. Thus we (rightly) suspect an error in the Tortoise's logic. 

Suppose that Achilles and the Tortoise agree on a 50 meter race, and the Tortoise is to get a 9 meter head start. Suppose further that Achilles, in full battle gear, is able to attain a speed of 10 m/s, and that the Tortoise can crawl at a very respectable 1 m/s.

• Firstly, let's do a quick calculation to determine who wins. Achilles reaches the finish line in 5 seconds (50 meters at 10 m/s). At that time the Tortoise, travelling at 1 m/s, will only be at the 14 meter mark (9 m + 5 m). Thus Achilles wins, by 36 m.

• Next, let us use a simple algebraic argument to determine at what time (call it $T$) Achilles overtakes the Tortoise. At time $T$, Achilles will have travelled a distance of $10T$, whereas the Tortoise will have crawled to the point $9 + T$. Thus $10T = 9 + T$, which implies that $T = 1$. Thus, Achilles overtakes the Tortoise at $T = 1$ seconds.

• Next, let us carefully examine the Tortoise's argument: Achilles will reach the Tortoise's starting point (the 9 m mark) in $\frac{9}{10}$ of a second. At that stage the Tortoise will have crawled $\frac{9}{10}$ meters ahead. Achilles will cover this distance in a mere $\frac{9}{100}$ of a second, but the Tortoise will have used this time to inch forward another $\frac{9}{100}$ meters. Achilles will cover this distance in just $\frac{9}{1000}$ of a second, giving the Tortoise enough time to gain an additional $\frac{9}{1000}$ meters. . . .

• The Tortoise's argument can now be summed up as follows: From the above, we can see that Achilles will catch up with the Tortoise at time

$$T = \frac{9}{10} + \frac{9}{100} + \frac{9}{1000} + \dots$$

According to the Tortoise, Achilles will never catch up, and thus the time $T$ at which Achilles overtakes is (informally) $T = +\infty$. However, a simple algebraic argument proved that $T = 1$.



The flaw in the Tortoise's argument is therefore this: 

The Tortoise assumes that a "sum" with infinitely many 
strictly positive terms must necessarily add up to infinity. 

Because we know that the Tortoise is wrong, we now have valuable information: It must be possible to add up infinitely many strictly positive terms, and yet end up with a finite answer. However, addition is a binary operation: You can add up only two numbers at a time. Thus, for us, an expression such as

$$\frac{9}{10} + \frac{9}{100} + \frac{9}{1000} + \dots$$

has, as yet, no meaning. We need to provide it with a meaning. Moreover, for that meaning to be consistent with our algebraic argument, our definition must ensure that

$$\frac{9}{10} + \frac{9}{100} + \frac{9}{1000} + \dots = 1$$

² In Zeno's version, Achilles concedes the race, dazzled by the Tortoise's logic.






In Section 2.4 we proved that every real number has a decimal representation. It should be clear that, in decimal notation,

$$\frac{9}{10} + \frac{9}{100} + \frac{9}{1000} + \dots = 0.9999\ldots$$

But (of course) $0.9999\ldots = 9 \times (0.1111\ldots) = 9 \times \frac{1}{9} = 1$, so there is no contradiction here.
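To see how the finite sums $\frac{9}{10} + \frac{9}{100} + \dots + \frac{9}{10^n}$ behave, one can compute them exactly. The sketch below is an illustration only; the function name `partial_time` and the sample values of $n$ are our own choices.

```python
from fractions import Fraction

def partial_time(n):
    """Time taken by the first n stages: 9/10 + 9/100 + ... + 9/10^n."""
    return sum(Fraction(9, 10**k) for k in range(1, n + 1))

for n in (1, 2, 3, 10):
    t = partial_time(n)
    # The gap to T = 1 is exactly 1/10^n after n stages.
    print(n, t, 1 - t)
```

Every finite sum falls short of $1$, but only by $10^{-n}$, which can be made as small as we like by taking $n$ large.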

Exercise 2.5.1 Here's another one of Zeno's paradoxes: Motion is impossible.
For suppose that I want to walk a distance of $x > 0$ meters. Before I get to the $x$ meter mark, I would first have to reach the $\frac{1}{2}x$ meter mark. And before I can reach the $\frac{1}{2}x$ meter mark, I would first have to reach the $\frac{1}{4}x$ meter mark. . .
Thus there are infinitely many distances I have to travel before I reach the point $x$, and doing infinitely many things will take me an infinite amount of time.
Examine this argument, and point out the flaws.

□ 

2.5.2 Convergence of Series: Definition and Examples 

Note that, as written, the expression

$$x_1 + x_2 + x_3 + \dots + x_n + \dots \qquad (*)$$

seems to require an infinite number of additions. We have no idea what that might mean, because addition is a binary operation: we can add only two numbers at a time. We therefore need to find a clear and unambiguous meaning for $(*)$.

A careful examination of the resolution of Zeno's paradoxes leads us to believe that it may be possible to provide such a meaning.

We shall do this as follows: For each $n$, define

$$s_n = x_1 + x_2 + \dots + x_n = \sum_{k=1}^{n} x_k$$

Each $s_n$ is called a partial sum of the series, and is a well-defined number, as it involves only finitely many additions. We thus obtain a sequence $(s_n)_n$ of partial sums. We shall define the series $(*)$ to be the sequence $(s_n)_n$.

Thus a series is a sequence, not a number!

In particular,

$$\text{The series } \sum_{k=1}^{\infty} x_k \text{ is the sequence } \Bigl(\sum_{k=1}^{n} x_k\Bigr)_n$$

If the sequence of partial sums converges, i.e. if $s_n \to s$, then we shall say that the series converges to $s$, and write

$$\sum_{k=1}^{\infty} x_k = s \quad \text{instead of} \quad \lim_{n \to \infty} \sum_{k=1}^{n} x_k = s$$

Thus when we say $\sum_{k=1}^{\infty} x_k = s$, we mean that the sequence of partial sums converges, and that its limit is $s$. A series which does not converge is said to diverge.

A series converges to $s$ if and only if you can get as close to $s$ as you like by adding up sufficiently many terms of the series.






Some remarks on notation: We will write $\sum_{n=m}^{\infty} x_n$ for the series $x_m + x_{m+1} + x_{m+2} + \dots$. We may also write $\sum_n x_n$ instead of $\sum_{n=1}^{\infty} x_n$, if there is no danger of confusion.

Furthermore, we write $\sum_n x_n = \infty$ if $s_n \to \infty$. But note that in that case the series is divergent.

Example 2.5.2 Let $x \in \mathbb{R}$ with $|x| < 1$. Then the series $\sum_{k=0}^{\infty} x^k$ is the sequence $(s_n)_n$, where

$$s_n = 1 + x + x^2 + \dots + x^n = \frac{1 - x^{n+1}}{1 - x}$$

Now $x^{n+1} \to 0$ as $n \to \infty$, because $|x| < 1$ (see Example 2.3.7). It follows that $s_n \to \frac{1}{1 - x}$. We therefore write

$$\sum_{k=0}^{\infty} x^k = \frac{1}{1 - x}$$

This does not mean that if you add up all the (infinitely many) $x^k$, you end up with $\frac{1}{1-x}$: you can't add up infinitely many terms. Instead, it means that the sequence of partial sums converges to $\frac{1}{1-x}$.



□ 
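The closed form for the partial sums in Example 2.5.2, and the way they approach $\frac{1}{1-x}$, can be checked exactly for a sample ratio. This is a sketch under our own assumptions: the name `partial_sum`, the value $x = \frac{1}{3}$ and the indices tested are illustrative only.

```python
from fractions import Fraction

def partial_sum(x, n):
    """s_n = 1 + x + x^2 + ... + x^n, computed by direct addition."""
    return sum(x**k for k in range(n + 1))

x = Fraction(1, 3)                       # illustrative ratio with |x| < 1
limit = 1 / (1 - x)                      # the claimed limit 1/(1 - x)

# The direct sum agrees with the closed form (1 - x^(n+1)) / (1 - x).
closed_form_ok = all(
    partial_sum(x, n) == (1 - x**(n + 1)) / (1 - x) for n in (0, 1, 5, 20)
)

# The remaining gap limit - s_n equals x^(n+1) / (1 - x), which tends to 0.
gap = limit - partial_sum(x, 20)
print(closed_form_ok, float(gap))
```

No partial sum ever equals the limit; the gap is always strictly positive here, in keeping with the remark that a series is a sequence of partial sums and not an "infinite addition".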



Example 2.5.3 The series

$$1 - 1 + 1 - 1 + 1 - \dots$$

diverges. For

$$s_n = \begin{cases} 1 & \text{if } n \text{ is odd} \\ 0 & \text{if } n \text{ is even} \end{cases}$$

i.e. $(s_n)_n$ is the sequence $1, 0, 1, 0, \dots$, and that diverges.

□ 



Combining Examples 2.5.2 and 2.5.3 leads to the following result:

Theorem 2.5.4 The series $\sum_{n=0}^{\infty} x^n$ converges if and only if $|x| < 1$. When $|x| < 1$ we have

$$\sum_{n=0}^{\infty} x^n = \frac{1}{1 - x}$$

Example 2.5.5 (Harmonic Series)
We prove the following important fact:

$$\text{The harmonic series } \sum_{n=1}^{\infty} \frac{1}{n} \text{ diverges}$$

We show that the sequence of partial sums is unbounded. As every convergent sequence is bounded (cf. Proposition 2.2.22), this implies that $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges. We shall accomplish this by proving that $s_{2^n} \ge 1 + \frac{n}{2}$ for all $n$.






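We consider partial sums of the form $s_{2^n}$ (i.e. $s_1, s_2, s_4, s_8, \dots$) because we can group their terms in an ingenious way to obtain a lower bound:

$$s_{2^n} = 1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) + \dots + \left(\frac{1}{2^{n-1} + 1} + \dots + \frac{1}{2^n}\right)$$

Thus after the first two terms ($1$ and $\frac{1}{2}$) we group the remaining terms in brackets: The next two ($\frac{1}{3}$ and $\frac{1}{4}$), then the next four, then the next eight, etc., until we get a final bracket with $2^{n-1}$ terms. There are $n - 1$ brackets in total.

Now clearly

$$\frac{1}{3} + \frac{1}{4} \ge \frac{1}{4} + \frac{1}{4}$$

$$\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} \ge \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8}$$

$$\vdots$$

$$\frac{1}{2^{n-1} + 1} + \frac{1}{2^{n-1} + 2} + \dots + \frac{1}{2^n} \ge \frac{1}{2^n} + \frac{1}{2^n} + \dots + \frac{1}{2^n}$$

Now each expression on the righthand side adds up to exactly $\frac{1}{2}$, and there are $n - 1$ such expressions. It follows that

$$s_{2^n} \ge 1 + \frac{1}{2} + \frac{1}{2}(n - 1) = 1 + \frac{n}{2}$$

Thus the sequence $(s_n)_n$ is unbounded, which means that the series diverges.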

□ 
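The lower bound $s_{2^n} \ge 1 + \frac{n}{2}$ from Example 2.5.5 can be confirmed for small $n$ by exact computation. The sketch below is illustrative only: the name `harmonic` and the range $n = 0, \dots, 10$ are our own choices.

```python
from fractions import Fraction

def harmonic(m):
    """Partial sum s_m = 1 + 1/2 + ... + 1/m of the harmonic series."""
    return sum(Fraction(1, k) for k in range(1, m + 1))

# Check s_{2^n} >= 1 + n/2 for n = 0, 1, ..., 10.
checks = [harmonic(2**n) >= 1 + Fraction(n, 2) for n in range(0, 11)]
print(all(checks), float(harmonic(2**10)))
```

The partial sums grow extremely slowly (it takes $1024$ terms to get past $6$), which is why the divergence of the harmonic series is not at all apparent from numerical experiments alone.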

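Example 2.5.6 We show that if $p > 1$, then the series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges. The proof is very similar to the argument presented in the previous example. We split up partial sums of the form $s_{2^n - 1}$ into groups with power-of-two-many terms:

$$s_{2^n - 1} = 1 + \left(\frac{1}{2^p} + \frac{1}{3^p}\right) + \left(\frac{1}{4^p} + \frac{1}{5^p} + \frac{1}{6^p} + \frac{1}{7^p}\right) + \dots + \left(\frac{1}{2^{(n-1)p}} + \frac{1}{(2^{n-1} + 1)^p} + \dots + \frac{1}{(2^n - 1)^p}\right)$$

Now

$$\frac{1}{2^p} + \frac{1}{3^p} \le \frac{1}{2^p} + \frac{1}{2^p} = \frac{1}{2^{p-1}}$$

$$\frac{1}{4^p} + \frac{1}{5^p} + \frac{1}{6^p} + \frac{1}{7^p} \le \frac{1}{4^p} + \frac{1}{4^p} + \frac{1}{4^p} + \frac{1}{4^p} = \left(\frac{1}{2^{p-1}}\right)^2$$

$$\vdots$$

$$\frac{1}{2^{(n-1)p}} + \frac{1}{(2^{n-1} + 1)^p} + \dots + \frac{1}{(2^n - 1)^p} \le \frac{1}{2^{(n-1)p}} + \dots + \frac{1}{2^{(n-1)p}} = \left(\frac{1}{2^{p-1}}\right)^{n-1}$$

It follows that we have majorized the sequence $s_{2^n - 1}$ by a finite geometric series:

$$s_{2^n - 1} \le 1 + \frac{1}{2^{p-1}} + \left(\frac{1}{2^{p-1}}\right)^2 + \dots + \left(\frac{1}{2^{p-1}}\right)^{n-1} = \frac{1 - \left(\frac{1}{2^{p-1}}\right)^n}{1 - \frac{1}{2^{p-1}}}$$

Since $p > 1$, we have $0 < \frac{1}{2^{p-1}} < 1$, and so

$$s_{2^n - 1} < \frac{1}{1 - \frac{1}{2^{p-1}}}$$

It is now easy to see that the sequence $(s_n)_n$ is bounded: Given $k \in \mathbb{N}$, choose an $n$ such that $k \le 2^n - 1$. Then $s_k \le s_{2^n - 1}$ (because all terms are positive, and $s_{2^n - 1}$ includes all the terms of $s_k$, and possibly more). Moreover, $s_{2^n - 1} < \frac{1}{1 - \frac{1}{2^{p-1}}}$. We have therefore shown that

$$s_k < \frac{1}{1 - \frac{1}{2^{p-1}}} \quad \text{for all } k \in \mathbb{N}$$

and thus that the sequence $(s_n)_n$ is bounded above. Since it is also an increasing sequence, it must converge, by Theorem 2.3.11.

Thus the series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges when $p > 1$.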



□ 
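The bound $s_k < \frac{1}{1 - 2^{-(p-1)}}$ obtained in Example 2.5.6 can be compared with actual partial sums. The sketch below uses floating-point arithmetic and a handful of sample exponents, so it is an illustration rather than a verification; the names `p_partial` and `bound` are our own.

```python
def p_partial(p, k):
    """Partial sum s_k = 1 + 1/2^p + ... + 1/k^p."""
    return sum(1.0 / n**p for n in range(1, k + 1))

def bound(p):
    """Upper bound 1 / (1 - 1/2^(p-1)) from the geometric majorant (p > 1)."""
    r = 2.0 ** (1 - p)           # r = 1/2^(p-1), which lies in (0, 1)
    return 1.0 / (1.0 - r)

for p in (1.5, 2, 3):
    print(p, p_partial(p, 1000), bound(p))
```

For $p = 2$ the bound is $2$; the partial sums stay well below it, so the estimate is far from sharp, but boundedness is all that Theorem 2.3.11 requires. As $p$ decreases towards $1$ the bound blows up, consistent with the divergence of the harmonic series.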



Combining Examples 2.5.5 and 2.5.6 leads to the following result:

Theorem 2.5.7 The series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges if $p > 1$, and diverges if $p \le 1$.

Proof: Examples 2.5.5 and 2.5.6 supplied proofs for the cases $p = 1$ and $p > 1$ respectively. If $p < 1$, then $\frac{1}{n^p} \ge \frac{1}{n}$. Since the partial sums of $\sum_{n=1}^{\infty} \frac{1}{n}$ are unbounded, it follows easily that the partial sums of $\sum_{n=1}^{\infty} \frac{1}{n^p}$ are unbounded as well. Since a convergent sequence must be bounded (cf. Proposition 2.2.22), and since a series is a sequence of partial sums, we see that $\sum_{n=1}^{\infty} \frac{1}{n^p}$ diverges.

Remarks 2.5.8 In Section 2.4, we proved that every number has a decimal representation. We first defined what we meant by an expression of the form $0.a_1 a_2 a_3 \ldots$, and decided that we should define it to be $\lim_n s_n$, where $s_n = \frac{a_1}{10} + \frac{a_2}{10^2} + \dots + \frac{a_n}{10^n}$. Thus we defined $0.a_1 a_2 a_3 \ldots$ to be $\sum_{n=1}^{\infty} \frac{a_n}{10^n}$, i.e. the decimal representation of a number is a series.



□ 



Exercise 2.5.9 Suppose that $(x_n)_n$ and $(y_n)_n$ are sequences in $\mathbb{R}$, and that $a \in \mathbb{R}$.

(a) If $\sum_n x_n = x$ and $\sum_n y_n = y$, then $\sum_n (x_n + y_n) = x + y$.

(b) If $\sum_n x_n = x$, then $\sum_n (a x_n) = a x$.

[Hint: (a) Remember that series are sequences of partial sums. So let $s_n = \sum_{k=1}^{n} x_k$, $t_n = \sum_{k=1}^{n} y_k$, $u_n = \sum_{k=1}^{n} (x_k + y_k)$. You must show that $u_n \to (x + y)$, given that $s_n \to x$ and $t_n \to y$. Use Theorem 2.3.1.]



□ 






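Exercise 2.5.10 (The number e)
Show that the series

\[ \sum_{n=0}^\infty \frac{1}{n!} = \frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \dots \]

converges. The number $e$ is defined to be the limit of this series, i.e.

\[ e = \sum_{n=0}^\infty \frac{1}{n!} \]

[Hint: $n! \ge 2^{n-1}$ if $n \ge 2$. Use this fact to show that the partial sums of $\sum_{n=0}^\infty \frac{1}{n!}$ form an increasing bounded sequence, and invoke Theorem 2.3.11.]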



□ 
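The behaviour described in Exercise 2.5.10 can be observed numerically. The sketch below (the function name `e_partial_sum` is an illustrative choice) computes partial sums of $\sum 1/n!$ in floating point; it is an illustration only and does not replace the proof asked for in the exercise.

```python
from math import factorial, e


def e_partial_sum(n):
    """Return 1/0! + 1/1! + ... + 1/n! as a float."""
    return sum(1.0 / factorial(k) for k in range(n + 1))


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20):
        # The partial sums increase and stay below 3.
        print(n, e_partial_sum(n), e - e_partial_sum(n))
```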



2.6 Convergence of Subsequences 
2.6.1 Subsequences 

Roughly speaking, if you write down all the terms of a sequence $(x_n)_n$, and then delete some of these terms, what remains is a subsequence. However, you're not allowed to delete so many terms that only finitely many remain, nor are you allowed to rearrange the order in which they occur.

This is best understood by looking at some examples: The sequence $2, 3, 5, 7, 11, \dots$ of primes is a subsequence of the sequence $1, 2, 3, 4, \dots$ of natural numbers:

\[ \cancel{1}, 2, 3, \cancel{4}, 5, \cancel{6}, 7, \cancel{8}, \cancel{9}, \cancel{10}, 11, \dots \]

In the subsequence, the order of elements remains the same as what it was in the original: 2 comes before 3 comes before 5, etc. in both sequences.

The sequence $3, 2, 6, 5, 9, 8, \dots$ is not a subsequence of $1, 2, 3, 4, 5, \dots$. Not only have we deleted all numbers of the form $3n - 2$, we have also rearranged them so that $3n$ is before $3n - 1$. In the sequence of natural numbers, 2 is before 3, but in this new sequence, 3 is before 2. Such rearrangements are not allowed when you construct a subsequence.
The following definition should now make sense: 

Definition 2.6.1 Let $(x_n)_n$ be a sequence in $\mathbb{R}$, and suppose that $(n_k)_k$ is a strictly increasing sequence in $\mathbb{N}$ (i.e. $n_1 < n_2 < n_3 < \dots$). Then the sequence

\[ (x_{n_k})_k = x_{n_1}, x_{n_2}, x_{n_3}, \dots \]

is called a subsequence of $(x_n)_n$.
For example

\[ (x_{2n})_n = x_2, x_4, x_6, \dots \]
\[ (x_{3n-1})_n = x_2, x_5, x_8, \dots \]

are subsequences of $(x_n)_n$.
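Definition 2.6.1 can be mirrored in a few lines of Python. In this sketch a sequence is modelled as a function of $n \ge 1$, and `subsequence` (a hypothetical helper, not from the text) refuses any index map that is not strictly increasing, which is exactly what rules out rearrangements.

```python
from itertools import count, islice


def subsequence(x, index):
    """Yield x(n_1), x(n_2), ... where index(k) = n_k must be strictly increasing."""
    previous = 0
    for k in count(1):
        n_k = index(k)
        if n_k <= previous:
            raise ValueError("index sequence must be strictly increasing")
        previous = n_k
        yield x(n_k)


if __name__ == "__main__":
    naturals = lambda n: n
    print(list(islice(subsequence(naturals, lambda k: 2 * k), 5)))      # x_2, x_4, ...
    print(list(islice(subsequence(naturals, lambda k: 3 * k - 1), 5)))  # x_2, x_5, ...
```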






Remarks 2.6.2 1. One easy but useful fact to note is the following: If $n_1 < n_2 < n_3 < \dots$ is a strictly increasing sequence of natural numbers, then $n_k \ge k$ (for each $k \in \mathbb{N}$).
If you can't see this immediately, try proving it by induction. Clearly $n_1 \ge 1$. Now suppose that $n_k \ge k$. Then $n_{k+1} > n_k \ge k$, and thus $n_{k+1} \ge k + 1$.

2. Note that the $n$ in $(x_n)_n$ is a "dummy" variable, not really a variable at all. This means that it doesn't matter if we replace the $n$ by some other symbol $k$: $(x_k)_k$ is exactly the same as $(x_n)_n$.

For example $(\frac{1}{k})_k = 1, \frac12, \frac13, \dots = (\frac{1}{n})_n$.

In particular, $\lim_k x_k$ is exactly the same as $\lim_n x_n$, $\sup_k x_k$ the same as $\sup_n x_n$, etc.

In the expression $(x_n)_n$, the variable $n$ is a bound variable, constrained to take on all possible values in the set $\mathbb{N}$. We have a similar situation when we deal with definite integrals: The expression $\int_0^1 x\,dx$ is a number, namely $\frac12$, and not a variable, even though it seems to have a variable $x$ occurring in it. However, that $x$ is a bound variable, constrained to take on all possible values between 0 and 1. It doesn't matter if we replace the $x$ by some other symbol $u$: $\int_0^1 x\,dx$ is exactly the same as $\int_0^1 u\,du$.

□ 

One important type of subsequence is a tail sequence. A tail sequence of $(x_n)_n$ is a subsequence which consists of all terms of $(x_n)_n$ from some $N$ onwards, e.g. $5, 6, 7, \dots$ is a tail sequence of $1, 2, 3, \dots$. Similarly, dropping the first few terms of $1, \frac12, \frac13, \frac14, \dots$ leaves a tail sequence of it. Thus $(y_n)_n$ is a tail sequence of $(x_n)_n$ iff there is an integer $N \ge 0$ such that $y_n = x_{N+n}$.

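Example 2.6.3 If

\[ x_n = \begin{cases} \frac{1}{n} & \text{if } n \text{ is odd} \\ 1 + \frac{1}{n^2} & \text{if } n \text{ is even} \end{cases} \]

then $(x_n)_n$ is divergent. However, the sequences $(y_n)_n$, $(z_n)_n$ defined by

\[ y_n = x_{2n-1} = \frac{1}{2n-1}, \qquad z_n = x_{2n} = 1 + \frac{1}{4n^2} \]

are convergent subsequences of $(x_n)_n$, with $y_n \to 0$, and $z_n \to 1$.

If you think long enough, it should be clear that a subsequence $(x_{n_k})_k$ of $(x_n)_n$ converges if and only if: EITHER the sequence $(n_k)_k$ is odd eventually (in which case $x_{n_k} \to 0$), OR $(n_k)_k$ is even eventually (in which case $x_{n_k} \to 1$).

Similarly, if $(n_k)_k$ is BOTH odd infinitely often and even infinitely often, then $(x_{n_k})_k$ diverges.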

□ 

Proposition 2.6.4 (a) If $x_n \to x$, and if $(y_n)_n$ is a subsequence of $(x_n)_n$, then $y_n \to x$ as well.

(b) If $(y_n)_n$ is a tail sequence of $(x_n)_n$, and if $y_n \to x$, then also $x_n \to x$.

Proof: (a) Suppose that $y_k = x_{n_k}$, where $n_1 < n_2 < n_3 < \dots$. We must show that $y_k \to x$, i.e. that for every $\varepsilon > 0$, there is a $K \in \mathbb{N}$ such that $|y_k - x| < \varepsilon$ whenever $k \ge K$.

So let $\varepsilon > 0$ be given. First choose $M \in \mathbb{N}$ such that $|x_m - x| < \varepsilon$ whenever $m \ge M$. (Why can we choose such an $M$?)






Now choose $K \in \mathbb{N}$ such that $n_k \ge M$ whenever $k \ge K$. (For example, by Remarks 2.6.2 we have $n_k \ge k$ for all $k \in \mathbb{N}$. In particular, $n_M \ge M$, so we can choose $K = M$.) Then if $k \ge K$, we have

\[ |y_k - x| = |x_{n_k} - x| < \varepsilon \]

(because $n_k \ge M$ implies $|x_{n_k} - x| < \varepsilon$). Thus $y_k \to x$, as required.

(b) Suppose that $(y_n)_n$ is a convergent tail sequence of $(x_n)_n$, and that $y_n \to x$. We must show that also $x_n \to x$. So let $\varepsilon > 0$. Choose $N$ such that $n \ge N$ implies $|y_n - x| < \varepsilon$. Next, note that by definition of "tail sequence", there is a non-negative integer $M$ such that $y_n = x_{n+M}$. It follows that if $n \ge N + M$, then $n - M \ge N$, so that $|y_{n-M} - x| < \varepsilon$. But $y_{n-M} = x_n$, and thus

\[ |x_n - x| < \varepsilon \quad \text{whenever } n \ge N + M \]

Since we can do this for any $\varepsilon > 0$, we have shown that $x_n \to x$.



Example 2.6.5 Consider again the sequence

\[ x_n = \begin{cases} \frac{1}{n} & \text{if } n \text{ is odd} \\ 1 + \frac{1}{n^2} & \text{if } n \text{ is even} \end{cases} \]

The sequences

\[ y_n = \frac{1}{2n-1}, \qquad z_n = 1 + \frac{1}{4n^2} \]

are subsequences of $(x_n)_n$. Since $y_n \to 0$ and $z_n \to 1$, we can conclude that $(x_n)_n$ is divergent.

For if $(x_n)_n$ converged (to $x$, say), then all its subsequences would also converge to the same limit $x$. But here we have two subsequences which converge to different limits.

□ 
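The divergence argument of Example 2.6.5 can be illustrated numerically. The sketch below assumes the sequence is $x_n = 1/n$ for odd $n$ and $x_n = 1 + 1/n^2$ for even $n$ (the formula is partly illegible in the source, so this choice is an assumption), and tabulates the odd-indexed and even-indexed subsequences.

```python
def x(n):
    """Assumed form of the example sequence: 1/n for odd n, 1 + 1/n^2 for even n."""
    return 1.0 / n if n % 2 == 1 else 1.0 + 1.0 / n**2


# y_n = x_{2n-1} and z_n = x_{2n} for n = 1, ..., 2000
odd_terms = [x(2 * n - 1) for n in range(1, 2001)]
even_terms = [x(2 * n) for n in range(1, 2001)]

if __name__ == "__main__":
    print(odd_terms[-1])   # close to 0
    print(even_terms[-1])  # close to 1
```

Two subsequences settling near different values is exactly the obstruction to convergence used in the example.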



Next, we show that every sequence of real numbers has a monotone subsequence. For the purpose of the proof, we briefly introduce some non-standard terminology. Let $(x_n)_n$ be a sequence of real numbers. Imagine that you are walking along a landscape, and that $x_n$ is your height above sea level at time $n$. Call $x_n$ a vista if you can see the whole landscape ahead of you, i.e. if $x_n \ge x_m$ for all $m > n$. Thus if $(x_n)_n$ is decreasing, then each $x_n$ is a vista, whereas if $(x_n)_n$ is strictly increasing, there are no vistas at all. If $x_n := 1 + \frac{(-1)^n}{n}$, then every even point $x_{2n}$ is a vista.

Theorem 2.6.6 Every sequence of real numbers has a monotone subsequence. 

Proof: We consider two cases: Either (1) $(x_n)_n$ has infinitely many vistas, or (2) it has only finitely many. In case (1), let $x_{n_1}, x_{n_2}, x_{n_3}, \dots$ be the subsequence of vistas, in order of increasing subscript. Note that

\[ x_{n_1} \ge x_{n_2} \ge x_{n_3} \ge \dots \ge x_{n_k} \ge \dots \]

is a decreasing subsequence of $(x_n)_n$.






In case (2), $(x_n)_n$ has only finitely many vistas, so there is an $N \in \mathbb{N}$ such that there are no vistas from the point $x_N$ onwards, i.e. if $n \ge N$, then $x_n$ is not a vista. Now construct a subsequence as follows. Let $n_1 = N$. Since $x_{n_1}$ is not a vista, there is $n_2 > n_1$ such that $x_{n_2} > x_{n_1}$. Since $x_{n_2}$ is not a vista, there is $n_3 > n_2$ such that $x_{n_3} > x_{n_2}$. Continuing in this way, we obtain an increasing sequence

\[ x_{n_1} < x_{n_2} < x_{n_3} < \dots < x_{n_k} < \dots \]
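To get a feel for vistas, the following sketch finds them in a finite list, reading "vista" as $x_i \ge x_j$ for every later $j$ in the list. This is only a finite stand-in for the definition above: in a finite list the last entry is always a vista, so the code illustrates case (1) of the proof (vistas form a decreasing subsequence) rather than the full dichotomy. The name `vista_indices` is an illustrative choice.

```python
def vista_indices(xs):
    """Indices i with xs[i] >= xs[j] for every later index j in the finite list xs."""
    vistas = []
    running_max = float("-inf")  # maximum of the entries to the right of position i
    for i in range(len(xs) - 1, -1, -1):
        if xs[i] >= running_max:
            vistas.append(i)
            running_max = xs[i]
    vistas.reverse()
    return vistas


if __name__ == "__main__":
    heights = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
    idx = vista_indices(heights)
    print(idx, [heights[i] for i in idx])
```

The heights at the vista indices always form a decreasing (non-increasing) list, as in case (1).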



2.6.2 Bolzano-Weierstrass Theorem

This theorem is so important that it deserves a subsection all to itself:

Theorem 2.6.7 (Bolzano-Weierstrass) Every bounded sequence of real numbers has a convergent subsequence.

Proof: By Theorem 2.6.6, any sequence $(x_n)_n$ has a monotone subsequence. If $(x_n)_n$ is bounded, then so is the subsequence. But a bounded monotone sequence converges, by Theorem 2.3.11.



2.7 Cauchy Sequences and Completeness 



We have already seen that any bounded increasing sequence converges (Theorem 2.3.11) — a 
fact that followed from the Completeness Axiom. This fact allowed us, in Example 2.3.13, to
conclude that the sequence $((1 + \frac{1}{n})^n)_n$ converges, though we could not see where it converges
to. The Completeness Axiom guarantees the existence of a limit, even if we do not know 
what that limit is. 

Like a bounded increasing sequence, a Cauchy sequence is a sequence that "ought to" 
converge. And, as we shall see, a Cauchy sequence does converge: The existence of a limit is 
guaranteed by the Completeness Axiom, even if we do not know what that limit actually is. 

Intuitively, a sequence $(x_n)_n$ in $\mathbb{R}$ is a Cauchy sequence if its terms lie eventually arbitrarily
close to each other. This means that from some point onwards, any two terms are "close". If
all terms lie closer and closer together, there should be some point that they are all clustering
around, and that point should be the limit of the sequence.

All this "ought" and "should" needs to be made precise. 



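Definition 2.7.1 A sequence $(x_n)_n$ in $\mathbb{R}$ is called a Cauchy sequence if and only if for every $\varepsilon > 0$ there is an $N \in \mathbb{N}$ such that

\[ |x_n - x_m| < \varepsilon \quad \text{whenever } n, m \ge N \]

i.e. if and only if

\[ \forall \varepsilon > 0 \;\exists N \in \mathbb{N} \;\forall n, m \ge N \; [\,|x_n - x_m| < \varepsilon\,] \]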



Remarks 2.7.2 (a) Note that all terms from some point onwards need to be within $\varepsilon$ of each other, not just successive terms. Thus, for example, if $N = 100$, then not just do we have $|x_{100} - x_{101}| < \varepsilon$, but also $|x_{301} - x_{15\,673\,428}| < \varepsilon$.






(b) A neat way to characterize Cauchy sequences is as follows:

\[ (x_n)_n \text{ is a Cauchy sequence} \iff \lim_{N \to \infty} \sup_{n \ge N} |x_n - x_N| = 0 \]

Here ($\Rightarrow$) is obvious. ($\Leftarrow$) follows by the triangle inequality: Given $\varepsilon > 0$, choose $N$ such that $\sup_{k \ge N} |x_k - x_N| < \frac{\varepsilon}{2}$. Then for $n, m \ge N$ we have

\[ |x_n - x_m| \le |x_n - x_N| + |x_N - x_m| \le 2 \sup_{k \ge N} |x_k - x_N| < \varepsilon \]

□ 

Example 2.7.3 The sequence $(1 + (-1)^n 2^{-n})_n$ is Cauchy. Indeed, given $\varepsilon > 0$, we may choose $N$ such that $2^{-N} < \frac{\varepsilon}{2}$. If $n, m \ge N$, then (by the triangle inequality)

\[ |(1 + (-1)^n 2^{-n}) - (1 + (-1)^m 2^{-m})| \le 2^{-n} + 2^{-m} \le 2^{-N} + 2^{-N} < \varepsilon \]

□ 
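The estimate in Example 2.7.3 can be checked on finite windows of the sequence. In this sketch, `spread(N, M)` (an illustrative name) is the largest distance $|x_n - x_m|$ with $N \le n, m \le M$; by the example it should never exceed $2 \cdot 2^{-N}$. A finite window cannot establish the Cauchy property, but it shows the bound at work.

```python
def x(n):
    """The sequence of Example 2.7.3: x_n = 1 + (-1)^n * 2^(-n)."""
    return 1.0 + (-1) ** n * 2.0 ** (-n)


def spread(N, M):
    """max |x_n - x_m| over N <= n, m <= M (this equals max minus min on the window)."""
    window = [x(n) for n in range(N, M + 1)]
    return max(window) - min(window)


if __name__ == "__main__":
    for N in (1, 5, 10, 20):
        print(N, spread(N, N + 40), 2 * 2.0 ** (-N))
```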

Lemma 2.7.4 Every convergent sequence is a Cauchy sequence.

Proof: Suppose that $x_n \to x$, and that we are given $\varepsilon > 0$. We must find $N$ such that $|x_n - x_m| < \varepsilon$ whenever $n, m \ge N$.

Now because $x_n \to x$ there is $N \in \mathbb{N}$ such that $|x_n - x| < \frac{\varepsilon}{2}$ whenever $n \ge N$. In particular, if $n, m \ge N$, then

\[ |x_n - x_m| \le |x_n - x| + |x - x_m| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]

Hence $(x_n)_n$ is a Cauchy sequence.

⊣

So any convergent sequence is a Cauchy sequence. And this is not surprising: If the terms 
of a sequence {xn)n are eventually close to some point x (the limit), then those terms must 
also eventually be close to each other. 

More importantly, the converse is true: Any Cauchy sequence in $\mathbb{R}$ is convergent. To
prove this, we will need a number of lemmas. We shall prove:



• Every Cauchy sequence is bounded. 

• Every bounded sequence has a convergent subsequence. 

• If a Cauchy sequence $(x_n)_n$ has a convergent subsequence, then $(x_n)_n$ is itself convergent.

Actually, the second point has already been proved. It is the Bolzano-Weierstrass theorem 
(Theorem 2.6.7). Thus we need only prove the first and the last point. 



Lemma 2.7.5 If $(x_n)_n$ is a Cauchy sequence in $\mathbb{R}$, then $(x_n)_n$ is bounded.






Proof: Choose $N \in \mathbb{N}$ such that $|x_n - x_m| < 1$ whenever $n, m \ge N$. (This is possible, because $(x_n)_n$ is Cauchy: we have taken $\varepsilon = 1$.) Now define

\[ K = \max\{|x_1|, |x_2|, \dots, |x_{N-1}|, |x_N| + 1\} \]

We show that $K$ is a bound for $(x_n)_n$, i.e. that $|x_n| \le K$ for all $n \in \mathbb{N}$.

Consider separately the two cases (i) $n < N$, and (ii) $n \ge N$. In case (i), we obviously have $|x_n| \le K$ by definition of $K$. Suppose therefore, that $n \ge N$. In that case, both $n$ and $N$ are $\ge N$, and thus

\[ |x_n| \le |x_n - x_N| + |x_N| < 1 + |x_N| \le K \]

which finishes case (ii).



Lemma 2.7.6 If $(x_n)_n$ is a Cauchy sequence, and if $(x_n)_n$ has a convergent subsequence, then $(x_n)_n$ itself converges.

Proof: Suppose that $(x_{n_k})_k$ is a subsequence of the Cauchy sequence $(x_n)_n$, and that $x_{n_k} \to x$ (as $k \to \infty$). We show that $x_n \to x$ (as $n \to \infty$).

So let $\varepsilon > 0$. We must show that there is $N \in \mathbb{N}$ such that $|x_n - x| < \varepsilon$ whenever $n \ge N$. Now because $(x_n)_n$ is a Cauchy sequence, we can find an $N_1$ such that

\[ n, m \ge N_1 \quad \text{implies} \quad |x_n - x_m| < \frac{\varepsilon}{2} \]

Because $x_{n_k} \to x$, we can find a $K$ such that

\[ k \ge K \quad \text{implies} \quad |x_{n_k} - x| < \frac{\varepsilon}{2} \]

Now define $N = \max\{N_1, n_K\}$, and let $n \ge N$. Choose $k$ such that $n_k \ge N$. Then (i) $n, n_k \ge N_1$, and (ii) $k \ge K$ (because $n_k \ge N \ge n_K$). It follows that

\[ |x_n - x| \le |x_n - x_{n_k}| + |x_{n_k} - x| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]

whenever $n \ge N$.



Theorem 2.7.7 Let $(x_n)_n$ be a sequence in $\mathbb{R}$. Then $(x_n)_n$ converges if and only if it is a Cauchy sequence.

Proof: ($\Rightarrow$) is Lemma 2.7.4.

($\Leftarrow$): If $(x_n)_n$ is a Cauchy sequence, then it is bounded (by Lemma 2.7.5). Hence it has a convergent subsequence (by Theorem 2.6.7). It follows that $(x_n)_n$ converges (by Lemma 2.7.6).



Remarks 2.7.8 The fact that Cauchy sequences converge in $\mathbb{R}$ depends very much on the
Completeness Axiom. If you look back over the proof of Theorem 2.7.7, you will not see the
Completeness Axiom mentioned explicitly. But we do use the Bolzano-Weierstrass Theorem.
The latter's proof depended on the fact that bounded monotone sequences converge, and that
fact, in turn, requires the Completeness Axiom.






□ 



Exercises 2.7.9 1. (a) Prove that if $(x_n)$ converges, then $\lim_n (x_{n+1} - x_n) = 0$.

(b) Does the converse hold? i.e., does $\lim_n (x_n - x_{n+1}) = 0$ imply that $(x_n)$ converges?

2. (a) Suppose that a sequence $(x_n)$ has $|x_{n+1} - x_n| < 2^{-n}$ for all $n \in \mathbb{N}$. Show that $(x_n)$ converges.

(b) Does the same hold if we only know that $|x_n - x_{n+1}| < \frac{1}{n}$ for all $n \in \mathbb{N}$?

□ 

Exercise 2.7.10 Suppose that $a < b$ and that $0 < \lambda < 1$. Let $(x_n)$ be a sequence of real numbers defined inductively as follows:

\[ x_1 = a, \quad x_2 = b, \quad x_{n+1} = \lambda x_{n-1} + (1 - \lambda) x_n \quad \text{for } n \ge 2 \]

(a) Show that $|x_{n+1} - x_n| = \lambda |x_n - x_{n-1}|$.

(b) Conclude that $|x_{n+1} - x_n| = \lambda^{n-1}(b - a)$.

(c) Prove that if $n > m$, then $|x_n - x_m| \le (b - a) \cdot \lambda^{m-1} \sum_{k=0}^{n-m-1} \lambda^k$.

[Hint: Triangle inequality.]

(d) Deduce that $|x_n - x_m| < \frac{(b - a)\lambda^{m-1}}{1 - \lambda}$ when $n > m$.

(e) Now prove that $(x_n)$ converges by showing that it is a Cauchy sequence.
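A quick experiment makes parts (a) and (b) of Exercise 2.7.10 plausible before proving them. The sketch below (with the illustrative name `averaged_sequence` and the parameter `lam` standing for $\lambda$) generates the sequence and prints the gaps $|x_{n+1} - x_n|$ next to $\lambda^{n-1}(b - a)$.

```python
def averaged_sequence(a, b, lam, count):
    """First `count` terms of x_1 = a, x_2 = b, x_{n+1} = lam*x_{n-1} + (1-lam)*x_n."""
    xs = [a, b]
    while len(xs) < count:
        xs.append(lam * xs[-2] + (1 - lam) * xs[-1])
    return xs[:count]


if __name__ == "__main__":
    a, b, lam = 0.0, 1.0, 0.5
    xs = averaged_sequence(a, b, lam, 12)
    for n in range(1, 11):
        # xs[n] is x_{n+1} and xs[n-1] is x_n (Python lists are 0-indexed)
        print(n, abs(xs[n] - xs[n - 1]), lam ** (n - 1) * (b - a))
```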

2.8 Further Results on Convergence of Series 

2.8.1 Cauchy Criteria 

The following result is often useful: 



□ 



Theorem 2.8.1 (Cauchy Criterion)
The series $\sum_{n=1}^\infty x_n$ converges if and only if

For every $\varepsilon > 0$ there is $N \in \mathbb{N}$ such that $m > n \ge N$ implies $\left|\sum_{k=n+1}^m x_k\right| < \varepsilon$

Proof: Recall that a sequence converges if and only if it is a Cauchy sequence (Theorem 2.7.7). Now the series $\sum_{n=1}^\infty x_n$ is the sequence $(s_n)_n$ of partial sums, and will therefore converge if and only if $(s_n)_n$ is a Cauchy sequence. Now note that if $m > n$, then

\[ |s_m - s_n| = \left|\sum_{k=n+1}^m x_k\right| \]

so the condition in the theorem says precisely that we can make $|s_m - s_n|$ as small as we like by taking $m, n$ to be sufficiently large.






Taking $m = k$, $n = k - 1$ in the preceding theorem, we see that $|s_m - s_n| = |x_k| < \varepsilon$ whenever $k > N$. It follows immediately that:

Corollary 2.8.2 If $\sum_{n=1}^\infty x_n$ converges, then $x_n \to 0$.

Alternatively, if $\sum_{n=1}^\infty x_n = s$, then the sequence of partial sums has $s_n \to s$, and thus
$\lim_n x_n = \lim_n (s_n - s_{n-1}) = \lim_n s_n - \lim_n s_{n-1} = s - s = 0$.



Note that the converse of Corollary 2.8.2 is not true — see Example 2.5.5 on the divergence 
of the harmonic series. However, we can say something about convergence if the terms of the 
series alternate in sign. An example of an alternating series is 



\[ \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n} = 1 - \frac12 + \frac13 - \frac14 + \dots \]



Theorem 2.8.3 (Alternating Series) Suppose that $(x_n)_n$ is a decreasing non-negative sequence in $\mathbb{R}$ such that $x_n \to 0$. Then the alternating series

\[ \sum_{n=1}^\infty (-1)^{n-1} x_n = x_1 - x_2 + x_3 - x_4 + \dots \]

converges.



Proof: Note that, since $(x_n)_n$ is decreasing, we have $x_k - x_{k+1} \ge 0$, for all $k \in \mathbb{N}$. It follows that, if $m > n$, then

\[ 0 \le x_{n+1} - x_{n+2} + x_{n+3} - \dots \pm x_m \le x_{n+1} \]

We now apply the Cauchy criterion. As always, let $s_n = \sum_{k=1}^n (-1)^{k-1} x_k$, and let $\varepsilon > 0$. Since $x_n \to 0$, there is $N \in \mathbb{N}$ such that $x_n < \varepsilon$ whenever $n \ge N$. It follows that if $m > n \ge N$, then

\[ |s_m - s_n| = |x_{n+1} - x_{n+2} + \dots \pm x_m| \le x_{n+1} < \varepsilon \]

Thus $(s_n)_n$ is a Cauchy sequence, and thus convergent.
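The key estimate of the proof, $|s_m - s_n| \le x_{n+1}$ for $m > n$, is easy to test numerically. The sketch below uses $x_k = 1/k$ as a sample decreasing sequence; `alternating_partial_sum` is an illustrative name, and the small tolerance in the comparison only absorbs floating-point rounding.

```python
def alternating_partial_sum(x, n):
    """s_n = x(1) - x(2) + x(3) - ... +/- x(n)."""
    return sum((-1) ** (k - 1) * x(k) for k in range(1, n + 1))


if __name__ == "__main__":
    x = lambda k: 1.0 / k
    for n in (1, 10, 100):
        s_n = alternating_partial_sum(x, n)
        worst = max(abs(alternating_partial_sum(x, m) - s_n) for m in range(n + 1, n + 200))
        print(n, worst, x(n + 1), worst <= x(n + 1) + 1e-12)
```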



Example 2.8.4 It follows from Theorem 2.8.3 that the alternating harmonic series

\[ \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n} = 1 - \frac12 + \frac13 - \frac14 + \dots \]

converges.



□ 



The method of Examples 2.5.5 and 2.5.6 can be generalized. We state here the following 
result: 






Theorem 2.8.5 (Cauchy Condensation Test)

Suppose that $(x_k)_k$ is a decreasing non-negative sequence. Then the series $\sum_{k=1}^\infty x_k$ converges if and only if the series

\[ \sum_{k=0}^\infty 2^k x_{2^k} = x_1 + 2x_2 + 4x_4 + 8x_8 + \dots \]

converges.

For example, $\sum_{n=1}^\infty \frac{1}{n}$ converges if and only if $\sum_{k=0}^\infty 2^k \cdot \frac{1}{2^k}$ converges. But the latter series obviously diverges, and thus the harmonic series diverges as well.



The proof of Theorem 2.8.5 is left as an exercise: 



Exercise 2.8.6 We prove Theorem 2.8.5.

Let $(s_n)_n$ and $(t_n)_n$ be the sequences of partial sums of the above series, i.e.

\[ s_n = \sum_{k=1}^n x_k, \qquad t_n = \sum_{k=0}^n 2^k x_{2^k} \]

It is clear that $(s_n)_n$ and $(t_n)_n$ are increasing sequences. We must show that $(s_n)_n$ converges if and only if $(t_n)_n$ converges.

(a) Explain why it suffices to show that $(s_n)_n$ is bounded if and only if $(t_n)_n$ is bounded.

(b) Suppose that $n < 2^k$. Explain why

\[ s_n \le x_1 + (x_2 + x_3) + (x_4 + x_5 + x_6 + x_7) + \dots + (x_{2^k} + \dots + x_{2^{k+1}-1}) \le t_k \]

(c) Deduce that if $(s_n)_n$ is unbounded, then so is $(t_n)_n$.

(d) Next, let $n > 2^k$. Explain why

\[ s_n \ge x_1 + x_2 + (x_3 + x_4) + (x_5 + x_6 + x_7 + x_8) + \dots + (x_{2^{k-1}+1} + \dots + x_{2^k}) \ge \tfrac12 t_k \]

(e) Deduce that if $(t_n)_n$ is unbounded, then so is $(s_n)_n$.

(f) Now explain why $\sum_{k=1}^\infty x_k$ converges if and only if $\sum_{k=0}^\infty 2^k x_{2^k}$ converges.

□ 
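The two inequalities of parts (b) and (d) can be spot-checked for a concrete decreasing sequence. The sketch below uses $x_k = 1/k^2$ and the illustrative helper names `s` and `t` for the two kinds of partial sums; it checks finitely many cases only and is no substitute for the argument requested in the exercise.

```python
def s(x, n):
    """Partial sum s_n = x(1) + ... + x(n) of the original series."""
    return sum(x(k) for k in range(1, n + 1))


def t(x, k):
    """Partial sum t_k = x(1) + 2 x(2) + 4 x(4) + ... + 2^k x(2^k) of the condensed series."""
    return sum(2**j * x(2**j) for j in range(k + 1))


if __name__ == "__main__":
    x = lambda k: 1.0 / k**2
    for k in range(1, 8):
        print(k, s(x, 2**k - 1), t(x, k), s(x, 2**k + 1), 0.5 * t(x, k))
```

For the harmonic series, each condensed term $2^j \cdot \frac{1}{2^j}$ equals 1, so `t` grows without bound, matching the example after Theorem 2.8.5.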

Here's some more practice in the use of Theorem 2.8.5:

Exercise 2.8.7 Use Theorem 2.8.5 to prove that $\sum_{n=2}^\infty \frac{1}{n(\ln n)^p}$ converges if $p > 1$, and diverges if $p \le 1$.



□ 



If $\sum_n x_n$ is a series, then by a tail series we mean a series of the form $\sum_{n=N+1}^\infty x_n$. Note
the following rather obvious, but useful, facts:






Proposition 2.8.8 (a) Let $\sum_n x_n$ be a series, and let $N \in \mathbb{N}$. Then $\sum_n x_n$ converges if and only if the tail series $\sum_{n=N+1}^\infty x_n$ converges. In that case

\[ \sum_{n=1}^\infty x_n = \sum_{n=1}^N x_n + \sum_{n=N+1}^\infty x_n \]

(b) The series $\sum_n x_n$ converges if and only if the sequence of tail series $\left(\sum_{k=n+1}^\infty x_k\right)_n$ converges to zero.

Proof: (a) Let $s_n$ be the $n$th partial sum of $\sum_n x_n$, and $t_n$ the $n$th partial sum of $\sum_{n=N+1}^\infty x_n = \sum_{n=1}^\infty x_{N+n}$, i.e.

\[ s_n = x_1 + x_2 + \dots + x_n, \qquad t_n = x_{N+1} + x_{N+2} + \dots + x_{N+n} \]

Note that $s_{N+n} = s_N + t_n$. Now the sequence $(s_n)_n$ converges if and only if its tail $(s_{N+n})_n$ converges, in which case they converge to the same limit, cf. Proposition 2.6.4. Since $s_{N+n} = s_N + t_n$, we see that $(s_n)_n$ converges if and only if $(t_n)_n$ converges, and thus that $\sum_n x_n$ converges if and only if $\sum_{n=N+1}^\infty x_n$ converges. Moreover, if $\sum_n x_n = x$, then

\[ \sum_{n=N+1}^\infty x_n = \lim_n t_n = \lim_n (s_N + t_n) - s_N = \lim_n s_{N+n} - s_N = x - s_N = \sum_{n=1}^\infty x_n - \sum_{n=1}^N x_n \]

as required.

(b) If $\sum_n x_n$ converges, then so does every tail series $\sum_{k=n+1}^\infty x_k$, by (a). Similarly, if some tail series converges, then so does $\sum_n x_n$, and hence so do all tail series (again by (a)). Since

\[ \sum_{k=1}^\infty x_k = \sum_{k=1}^n x_k + \sum_{k=n+1}^\infty x_k = s_n + \sum_{k=n+1}^\infty x_k \]

we see that $s_n \to \sum_k x_k$ if and only if $\sum_{k=n+1}^\infty x_k \to 0$.



2.8.2 Absolute Convergence and Rearrangement of Series 

Series are sequences of partial sums, and there is always a strong temptation to manipulate 
them in the same way as finite sums. That can be dangerous, as the following example makes 
clear: 

Example 2.8.9 We have seen that the harmonic series $\sum_{n=1}^\infty \frac{1}{n}$ diverges, but that the alternating harmonic series $\sum_{n=1}^\infty \frac{(-1)^{n-1}}{n}$ converges. Suppose that this latter series converges to a number $s \in \mathbb{R}$, i.e.

\[ s = 1 - \frac12 + \frac13 - \frac14 + \frac15 - \frac16 + \frac17 - \frac18 + \dots \]

and let $s_n$ be the $n$th partial sum.






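Now rearrange the terms of the alternating harmonic series to obtain a series

\[ 1 - \frac12 - \frac14 + \frac13 - \frac16 - \frac18 + \frac15 - \frac{1}{10} - \frac{1}{12} + \dots \]

and let $t_n$ be the $n$th partial sum of the rearranged series. Then

\[ t_{3n} = 1 - \frac12 - \frac14 + \frac13 - \frac16 - \frac18 + \dots + \frac{1}{2n-1} - \frac{1}{4n-2} - \frac{1}{4n} \]
\[ = \left(1 + \frac13 + \frac15 + \frac17 + \dots + \frac{1}{2n-1}\right) - \left(\frac12 + \frac16 + \frac{1}{10} + \dots + \frac{1}{4n-2}\right) - \left(\frac14 + \frac18 + \dots + \frac{1}{4n}\right) \]
\[ = \left(1 + \frac13 + \frac15 + \frac17 + \dots + \frac{1}{2n-1}\right) - \frac12\left(1 + \frac13 + \frac15 + \frac17 + \dots + \frac{1}{2n-1}\right) - \frac12\left(\frac12 + \frac14 + \dots + \frac{1}{2n}\right) \]
\[ = \frac12\left(1 - \frac12 + \frac13 - \frac14 + \dots + \frac{1}{2n-1} - \frac{1}{2n}\right) \]
\[ = \frac12 s_{2n} \]

Thus $\lim_n t_{3n} = \lim_n \frac12 s_{2n} = \frac12 s$. It follows easily that $t_n \to \frac12 s$. Thus the rearranged series converges, but not to the same limit as the original.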

□ 
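The identity $t_{3n} = \frac12 s_{2n}$ from Example 2.8.9 can be confirmed numerically. In the sketch below, `s` and `t3` are illustrative names for the partial sums of the original and the rearranged series; agreement is only up to floating-point rounding.

```python
def s(n):
    """n-th partial sum of the alternating harmonic series 1 - 1/2 + 1/3 - ..."""
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))


def t3(n):
    """Partial sum t_{3n} of the rearranged series: n blocks 1/(2j-1) - 1/(4j-2) - 1/(4j)."""
    return sum(1 / (2 * j - 1) - 1 / (4 * j - 2) - 1 / (4 * j) for j in range(1, n + 1))


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, t3(n), 0.5 * s(2 * n))
```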

As we shall see, the "problem" is that the harmonic series diverges, but that the alternating 
harmonic series does not. We therefore define: 



Definition 2.8.10 A series $\sum_k x_k$ is said to be absolutely convergent if and only if the series of absolute values $\sum_k |x_k|$ is convergent.

A series which is convergent but not absolutely convergent is called nonabsolutely or conditionally convergent.



Of course, a series of non-negative terms is convergent if and only if it is absolutely 
convergent. 

The notion of absolute convergence is stronger than that of convergence: 



Proposition 2.8.11 An absolutely convergent series is convergent.



Exercise 2.8.12 (a) Prove Proposition 2.8.11.

[Hint: Use the Cauchy criterion, combined with the fact that $\left|\sum_{k=n+1}^m x_k\right| \le \sum_{k=n+1}^m |x_k|$.]

(b) Exhibit a counterexample that shows that not every convergent series is absolutely convergent.

[Hint: Example 2.8.9.]



□ 



Remarks 2.8.13 (a) From the discussion so far, it should be clear that the series $\sum_{n=1}^\infty \frac{(-1)^{n-1}}{n^p}$ is (i) divergent if $p \le 0$ (its terms do not tend to 0), (ii) conditionally convergent if $0 < p \le 1$, and (iii) absolutely convergent if $p > 1$.






(b) Suppose that $\sum_n x_n$ is conditionally convergent. Then the sequence $(x_n)_n$ must change signs infinitely often, i.e. infinitely many $x_n$ are positive, and infinitely many are negative.

To see this: Assume, for the sake of argument, that only finitely many $x_n$ are positive, i.e. $(x_n)_n$ is not positive infinitely often. Then it must be non-positive eventually¹. It follows that from some $N$ onwards, the $x_n$ are negative (or zero). Thus there is $N$ such that $|x_n| = -x_n$ for all $n \ge N$ (because $|a| = -a$ if $a \le 0$).

Now because $\sum_n x_n$ converges, so does $\sum_{n=N+1}^\infty x_n$ (by Proposition 2.8.8), and thus so does $\sum_{n=N+1}^\infty |x_n|$ (because $|x_n| = -x_n$ for $n \ge N$). A tail of the series $\sum_{n=1}^\infty |x_n|$ therefore converges, which immediately implies that $\sum_{n=1}^\infty |x_n|$ itself converges as well (again by Proposition 2.8.8), i.e. that $\sum_n x_n$ is absolutely convergent: a contradiction.

A similar argument holds for the case where only finitely many of the $x_n$ are negative.

□ 

We now tackle the problem of rearrangements of series which are absolutely convergent. 
As we shall see, the problem disappears. But first, we need a definition of rearrangement: 

Definition 2.8.14 Let $f : \mathbb{N} \to \mathbb{N}$ be a bijection. The series $\sum_{n=1}^\infty x_{f(n)}$ is called a rearrangement of the series $\sum_{k=1}^\infty x_k$.

To say that $f : \mathbb{N} \to \mathbb{N}$ is a bijection is equivalent to saying that the list

\[ f(1), f(2), f(3), \dots \]

contains every member of $\mathbb{N}$ once and only once. Essentially, a bijection of $\mathbb{N}$ to $\mathbb{N}$ gives us the elements of $\mathbb{N}$ in rearranged order. The rearranged series

\[ x_3 + x_7 + x_1 + x_9 + x_5 + \dots \]

is obtained from $\sum_{n=1}^\infty x_n$ by a bijection $f$ having

\[ f(1) = 3, \; f(2) = 7, \; f(3) = 1, \; f(4) = 9, \; f(5) = 5, \dots \]

You can easily check that the rearranged alternating harmonic series of Example 2.8.9 is obtained from the alternating harmonic series by the bijection

\[ f(3n - 2) = 2n - 1, \quad f(3n - 1) = 4n - 2, \quad f(3n) = 4n \]

We are now able to formulate the main result of this section. The proof is left as an 
exercise: 

Theorem 2.8.15 If $\sum_{k=1}^\infty x_k$ converges absolutely, then every rearrangement of $\sum_{k=1}^\infty x_k$ converges, and to the same limit.

Exercise 2.8.16 We prove Theorem 2.8.15.

Suppose that $\sum_{k=1}^\infty x_k = s$, and let $(s_n)_n$ be the associated partial sums. Let $f : \mathbb{N} \to \mathbb{N}$ be a bijection, and let $x'_n = x_{f(n)}$. We must show that for the rearranged series, we have $\sum_{n=1}^\infty x'_n = s$ as well. Let $(s'_n)_n$ be the partial sums of the rearranged series.



¹ (Footnote to Remarks 2.8.13(b).) Because $\neg((x_n)_n \text{ has } P \text{ i.o.}) \equiv ((x_n)_n \text{ has } \neg P \text{ ev.})$, cf. Remarks 2.2.3.






(a) Let $\varepsilon > 0$. Explain why we may choose $N \in \mathbb{N}$ such that $m > n \ge N$ implies $\sum_{k=n+1}^m |x_k| < \varepsilon$.

(b) Explain why we may choose $N' \in \mathbb{N}$ such that

\[ \{1, 2, \dots, N\} \subseteq \{f(1), f(2), \dots, f(N')\} \]

[Hint: Consider $\max\{f^{-1}(1), f^{-1}(2), \dots, f^{-1}(N)\}$.]

(c) Explain why the terms $x_1, x_2, \dots, x_N$ occur in both $s_n$ and $s'_n$ if $n \ge N'$. These terms will therefore cancel in the expression $s'_n - s_n$.

(d) It follows that if $n \ge N'$, then $s'_n - s_n$ contains only terms $x_k$ for which $k > N$. Explain why we must then have

\[ |s'_n - s_n| < \varepsilon \quad \text{whenever } n \ge N' \]

(e) Finally, note that

\[ |s'_n - s| \le |s'_n - s_n| + |s_n - s| \]

and conclude that $s'_n \to s$.



□ 



Remarks 2.8.17 Here's an interesting fact: Suppose that $\sum_n x_n$ is a conditionally convergent series, i.e. that it converges, but not absolutely. Let $c$ be any real number whatsoever. Then we can find a rearrangement of $\sum_n x_n$ which converges to $c$.

The proof of this assertion is not hard, and we will merely outline it here: Given a number $c > 0$, say, add successive non-negative terms of $\sum_n x_n$ until they exceed $c$. (This is possible by the next exercise, Exercise 2.8.18.) Then add successive negative terms until we obtain a partial sum less than $c$, then add unused successive positive terms until the partial sum exceeds $c$, then unused successive negative terms until, and so on. Since $\sum_n x_n$ converges, the terms $x_n \to 0$, so it is not hard to see that this process will result in a rearrangement of $\sum_n x_n$ that converges to $c$.

□ 
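The greedy procedure outlined in Remarks 2.8.17 is easy to simulate for the alternating harmonic series. The following sketch (the name `rearranged_partial_sum` and its parameters are illustrative) takes the next unused positive term $1, \frac13, \frac15, \dots$ while the running total is at most the target, and the next unused negative term $-\frac12, -\frac14, \dots$ otherwise. It computes a finite partial sum only, so it suggests rather than proves convergence to the target.

```python
from itertools import count


def rearranged_partial_sum(target, n_terms):
    """Partial sum of the greedy rearrangement of 1 - 1/2 + 1/3 - ... aimed at `target`."""
    positives = (1.0 / k for k in count(1, 2))   # 1, 1/3, 1/5, ...
    negatives = (-1.0 / k for k in count(2, 2))  # -1/2, -1/4, -1/6, ...
    total = 0.0
    for _ in range(n_terms):
        # Below (or at) the target: add a positive term; above it: add a negative term.
        total += next(positives) if total <= target else next(negatives)
    return total


if __name__ == "__main__":
    for c in (1.5, 0.0, -0.5):
        print(c, rearranged_partial_sum(c, 100_000))
```

After each crossing of the target, the distance to the target is at most the size of the term that caused the crossing, which is why the printed sums land close to `c`.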

The next exercise shows that the positive and negative terms of a conditionally convergent series must "add up" to $\pm\infty$:

Exercise 2.8.18 Suppose that $\sum_n x_n$ is conditionally convergent. Let $(y_n)_n$, $(z_n)_n$ be the subsequences of $(x_n)_n$ consisting, respectively, of non-negative and negative terms. We show that $\sum_n y_n = +\infty$, $\sum_n z_n = -\infty$.

Suppose that $\sum_n x_n = x$. Let $X_n, Y_n, Z_n$ be, respectively, the $n$th partial sums of $\sum_n x_n$, $\sum_n y_n$ and $\sum_n z_n$. Furthermore, let $A_n$ be the $n$th partial sum of $\sum_n |x_n|$.

(a) Explain why $X_n = Y_n + Z_n$, and why $A_n = Y_n - Z_n$.

(b) Explain why $(Y_n)_n$ converges if and only if it is bounded. Derive a similar result for $(Z_n)_n$.

(c) Suppose now that $\sum_n y_n \ne \infty$. Explain why $\sum_n y_n$ converges.

(d) Let $\sum_n y_n = y$ be the limit. Note that $Z_n = X_n - Y_n$. Explain why $(Z_n)_n$ converges.

(e) Conclude that $(A_n)_n$ converges, and thus that $\sum_n |x_n|$ converges.

(f) Make sure that you understand why we have now obtained a contradiction from the assumption that $\sum_n y_n \ne \infty$, and that you are able to provide a similar contradiction from the assumption $\sum_n z_n \ne -\infty$.

□ 






2.8.3 More Tests for Convergence 

In previous sections, we have obtained several results which guarantee convergence, e.g. Theorems 2.8.1, 2.8.3 and 2.8.5. We derive here a few more results of that ilk.



Theorem 2.8.19 (Comparison Test)

Suppose that $\sum_n x_n$ is a convergent series and that $|y_n| \le x_n$ for all $n \in \mathbb{N}$ (or merely eventually). Then $\sum_n y_n$ converges absolutely.

Proof: Let $s_n$ be the $n$th partial sum of $\sum_n |y_n|$. We shall show that $(s_n)_n$ is a Cauchy sequence. So let $\varepsilon > 0$. Since $\sum_n x_n$ converges, the sequence of tail series $\left(\sum_{k=n+1}^\infty x_k\right)_n$ must converge to 0 (by Proposition 2.8.8), and thus there is an $N$ such that $\sum_{k=N+1}^\infty x_k < \varepsilon$.

Then if $n > m \ge N$, we have

\[ |s_n - s_m| = |y_{m+1}| + |y_{m+2}| + \dots + |y_n| \]
\[ \le x_{m+1} + x_{m+2} + \dots + x_n \]
\[ \le \sum_{k=N+1}^\infty x_k < \varepsilon \]

Hence $(s_n)_n$ is a Cauchy sequence, and thus convergent.

A minor modification, invoking Proposition 2.8.8, shows that the result remains true if we only have $|y_n| \le x_n$ eventually.



Theorem 2.8.20 (Root Test)
Suppose that $\sum_n x_n$ is a series in $\mathbb{R}$.

(a) If there is $\alpha < 1$ such that $|x_n|^{1/n} \le \alpha$ eventually, then $\sum_n x_n$ is absolutely convergent.

(b) If $|x_n|^{1/n} \ge 1$ infinitely often, then $\sum_n x_n$ diverges.

Proof: (a) There is $N \in \mathbb{N}$ such that

\[ |x_n| \le \alpha^n \quad \text{whenever } n \ge N \]

Now since $\alpha < 1$, the geometric series $\sum_n \alpha^n$ converges. By the Comparison Test, the series $\sum_n |x_n|$ converges as well.

(b) If $|x_n|^{1/n} \ge 1$ infinitely often, then $|x_n| \ge 1$ infinitely often, so certainly $x_n \not\to 0$. Thus $\sum_n x_n$ cannot be convergent (by Corollary 2.8.2).



Remarks 2.8.21 (1) Suppose that $\lim_n |x_n|^{1/n} =: \alpha$ exists. If $\alpha < 1$ then $\sum_n x_n$ converges absolutely. If $\alpha > 1$, then $\sum_n x_n$ diverges.






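(2) If $\lim_n |x_n|^{1/n} = 1$, then the Root Test is inconclusive. Recall that $n^{1/n} \to 1$ (this was the subject of an earlier exercise). If $x_n = \frac{1}{n}$, $y_n = \frac{1}{n^2}$, then $\limsup_n |x_n|^{1/n} = 1$ and also $\limsup_n |y_n|^{1/n} = 1$. Now $\sum_n x_n$ diverges, whereas $\sum_n y_n$ converges absolutely. It follows that the Root Test is inconclusive if $\alpha = 1$.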



□ 



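Theorem 2.8.22 (Ratio Test)

Suppose that $\sum_n x_n$ is a series with $\lim_n \left|\frac{x_{n+1}}{x_n}\right| = \alpha$. If $\alpha < 1$, then $\sum_n x_n$ converges absolutely, and if $\alpha > 1$, then $\sum_n x_n$ diverges.
If $\alpha = 1$, the test is inconclusive.

Proof: Suppose first that $\alpha < 1$. Choose $\beta$ such that $\alpha < \beta < 1$. Then $\left|\frac{x_{n+1}}{x_n}\right| < \beta$ eventually, i.e. there is $N \in \mathbb{N}$ such that

\[ \left|\frac{x_{n+1}}{x_n}\right| < \beta \quad \text{whenever } n \ge N \]

Then

\[ |x_{N+1}| < \beta |x_N| \]
\[ |x_{N+2}| < \beta |x_{N+1}| < \beta^2 |x_N| \]
\[ \vdots \]
\[ |x_{N+n}| < \beta^n |x_N| \]

Now $\sum_n \beta^n |x_N| = |x_N| \sum_n \beta^n$ converges, and thus by the Comparison Test, the tail series $\sum_n |x_{N+n}|$ converges. By Proposition 2.8.8, the series $\sum_n |x_n|$ converges as well.

If $\alpha > 1$, then $|x_{n+1}| > |x_n|$ eventually, so we cannot have $x_n \to 0$. Thus $\sum_n x_n$ diverges.

⊣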

Examples 2.8.23 Let $x \in \mathbb{R}$. We show that the series

\[ \sum_{n=0}^\infty \frac{x^n}{n!} \]

converges. We apply the Ratio Test. The ratio of successive terms is

\[ \frac{x^{n+1}/(n+1)!}{x^n/n!} = \frac{x}{n+1} \]

Now, no matter how big $x$ is, $\frac{x}{n+1} \to 0$ as $n \to \infty$. Since $0 < 1$, the Ratio Test guarantees that $\sum_{n=0}^\infty \frac{x^n}{n!}$ converges absolutely.

And you probably already know that it converges to $e^x$.

□ 
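The ratio $\frac{x}{n+1}$ computed in Examples 2.8.23 also gives a convenient way to evaluate the partial sums: each term is the previous one multiplied by $x/k$. The sketch below (with the illustrative name `exp_partial_sum`) does this and compares the result with `math.exp`; agreement is up to floating-point rounding.

```python
import math


def exp_partial_sum(x, n):
    """Sum of x^k / k! for k = 0..n, building each term from the previous one."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k      # ratio of successive terms is x / k
        total += term
    return total


if __name__ == "__main__":
    for x in (1.0, -2.0, 5.0):
        print(x, exp_partial_sum(x, 100), math.exp(x))
```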



Exercise 2.8.24 Determine whether or not the following series converge. 






(^) J2n=l 2ri~-l 

\") l^n=l (2n)! 

(c) En=ii-^r~'n-'^ 

(d) EZi ^ 

i^) Yln=2 Tilnn 

/r\ v^oo \/n+l — 



(g) EZin^-^ 

(h) Z^n=l (Inn)" 

□ 

Exercise 2.8.25 (a) Suppose that $\sum_n x_n$ and $\sum_n y_n$ are series with non-negative terms, and that $\frac{x_n}{y_n} \to l$ as $n \to \infty$, where $l \ne 0$. Prove that $\sum_n x_n$ converges if and only if $\sum_n y_n$ converges.

(b) Hence determine whether or not the following series converge:



n 



2n - 1 ^ ' ^ 2n2 + 4 



[Hint: (a) If $\frac{x_n}{y_n} \to l$, then eventually $y_n(l + \varepsilon) > x_n$. It follows that there is $C > 0$ such that $x_n \le C y_n$ for all $n$. Now use Comparison.]



□ 

Here is another result on convergence of series which is slightly ahead of time: Recall (from first-year calculus) that if $f$ is continuous on the interval $[a, \infty)$, then the improper integral $\int_a^\infty f(x)\,dx$ is defined by

\[ \int_a^\infty f(x)\,dx = \lim_{b \to \infty} \int_a^b f(x)\,dx \]

whenever this limit exists. If this limit exists, the improper integral is said to converge. (We are ahead of time because we have not yet said what we mean by $\lim_{b \to \infty}$. At this stage, you are no doubt able to provide the definition yourself, however.)



Proposition 2.8.26 (Integral Test for Convergence) Suppose that $f$ is continuous, decreasing and non-negative on the interval $[1, \infty)$. Further suppose that $\sum_{n=1}^\infty y_n$ is a series with $y_n = f(n)$ for $n \in \mathbb{N}$. Then the improper integral $\int_1^\infty f(x)\,dx$ converges if and only if the series $\sum_n y_n$ converges.



Exercise 2.8.27 We prove Proposition 2.8.26.

(a) Explain why $y_{k+1} \le f(x) \le y_k$ whenever $k \le x \le k + 1$ (where $k \in \mathbb{N}$).






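(b) Show that $y_{k+1} \le \int_k^{k+1} f(x)\,dx \le y_k$ for all $k \in \mathbb{N}$.

(c) Hence show that

\[ \sum_{k=2}^{n} y_k \le \int_1^n f(x)\,dx \le \sum_{k=1}^{n-1} y_k \]

for all $n \in \mathbb{N}$.

(d) Now suppose that $\int_1^\infty f(x)\,dx$ converges, i.e. that $\lim_{b \to \infty} \int_1^b f(x)\,dx$ exists and is $< \infty$. Explain why $\sum_{k=1}^\infty y_k$ converges as well.

(e) Next, assume that $\int_1^\infty f(x)\,dx$ diverges. Show that $\sum_{n=1}^\infty y_n$ diverges as well.

(f) The proof of the proposition is now complete. Now use this result to determine whether or not the series

\[ \sum_{n=3}^\infty \frac{1}{n \ln n} \]

converges. [Hint: Note that $\sum_{n=3}^\infty \frac{1}{n \ln n} = \sum_{n=1}^\infty \frac{1}{(n+2)\ln(n+2)}$.]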



□ 
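The sandwich $\sum_{k=2}^{n} y_k \le \int_1^n f(x)\,dx \le \sum_{k=1}^{n-1} y_k$, which follows by summing part (b) over $k = 1, \dots, n-1$, can be checked for a concrete function. The sketch below uses $f(x) = 1/x^2$, whose integral over $[1, n]$ is exactly $1 - \frac{1}{n}$; the function name `integral_test_bounds` and this choice of $f$ are illustrative assumptions, not taken from the text.

```python
def integral_test_bounds(n):
    """For f(x) = 1/x^2: return (sum_{k=2}^n y_k, integral of f over [1, n], sum_{k=1}^{n-1} y_k)."""
    y = lambda k: 1.0 / k**2
    lower = sum(y(k) for k in range(2, n + 1))
    integral = 1.0 - 1.0 / n  # exact value of the integral of 1/x^2 from 1 to n
    upper = sum(y(k) for k in range(1, n))
    return lower, integral, upper


if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        print(n, integral_test_bounds(n))
```

Since the integrals stay below 1, the lower sums are bounded, which is the mechanism behind part (d).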



2.9 lim sup and lim inf*

Suppose that $(x_n)$ is a bounded sequence in $\mathbb{R}$. Construct two new sequences as follows:

$$y_n = \sup\{x_m : m \geq n\} \qquad z_n = \inf\{x_m : m \geq n\}$$

Because $(x_n)$ is bounded, $y_n$ and $z_n$ exist (i.e. are finite real numbers), by the Completeness
Axiom.

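Suppose, for example, that $x_n = \frac{(-1)^n}{n}$ for $n \geq 1$. Then

$$\begin{aligned}
y_1 &= \sup\{-1, \tfrac12, -\tfrac13, \tfrac14, -\tfrac15, \dots\} = \tfrac12\\
y_2 &= \sup\{\tfrac12, -\tfrac13, \tfrac14, -\tfrac15, \dots\} = \tfrac12\\
y_3 &= \sup\{-\tfrac13, \tfrac14, -\tfrac15, \dots\} = \tfrac14\\
y_4 &= \sup\{\tfrac14, -\tfrac15, \dots\} = \tfrac14
\end{aligned}$$

i.e. $(y_n)$ is the sequence $\frac12, \frac12, \frac14, \frac14, \frac16, \frac16, \frac18, \dots$

Similarly, you can check that $(z_n)$ is the sequence $-1, -\frac13, -\frac13, -\frac15, -\frac15, \dots$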
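The tail suprema and infima of $x_n = (-1)^n/n$ can also be tabulated by machine. The sketch below is only an illustration under one stated assumption: the infinite tail $\{x_m : m \geq n\}$ is replaced by the finite window $n \leq m \leq$ `horizon`. For this particular sequence nothing is lost, because the supremum and infimum of each tail are attained within its first two terms. The names `x`, `tail_sup_inf` and `horizon` are hypothetical.

```python
from fractions import Fraction

def x(n):
    """The sequence x_n = (-1)^n / n, as an exact fraction."""
    return Fraction((-1) ** n, n)

def tail_sup_inf(n, horizon=1000):
    """Max and min of x_m over the finite window n <= m <= horizon.

    Assumption: for this sequence the sup and inf of the full tail are
    attained among the first two terms, so truncation does not change them.
    """
    tail = [x(m) for m in range(n, horizon + 1)]
    return max(tail), min(tail)

ys = [tail_sup_inf(n)[0] for n in range(1, 7)]  # first six terms of (y_n)
zs = [tail_sup_inf(n)[1] for n in range(1, 7)]  # first six terms of (z_n)

if __name__ == "__main__":
    print([str(y) for y in ys])
    print([str(z) for z in zs])
```

For a general bounded sequence a finite window only gives a lower bound for the tail supremum and an upper bound for the tail infimum, so this approach is a sanity check rather than a proof.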

Exercise 2.9.1 (a) Given $(x_n)_n$, write down the first 6 terms of $y_n = \sup\{x_m : m \geq n\}$ and
$z_n = \inf\{x_m : m \geq n\}$.

(i) $x_n = (-1)^n$

(ii) Xn = l 






(iii) $x_n = \begin{cases} 1 + \frac{1}{n} & \text{if } n \text{ is odd} \\ -1 - 2^{-n} & \text{if } n \text{ is even} \end{cases}$

(iv) $x_n = \begin{cases} 1 - \frac{1}{n} & \text{if } n \text{ is odd} \\ -1 + 2^{-n} & \text{if } n \text{ is even} \end{cases}$

(b) Note that each of the above sequences $(y_n)_n$ is decreasing, and that each of the $(z_n)_n$ is
increasing. Can you explain why?

(c) Finally, since the $(y_n)$ and $(z_n)$ are bounded monotone sequences, they must converge (by
Theorem 2.3.11). Write down $\lim_n y_n$ and $\lim_n z_n$ for each of the sequences in (a)(i)-(iv).

□ 

As noted in the above exercise, $(y_n)$ is a decreasing sequence, and $(z_n)$ is increasing. To
see this, let $A_n = \{x_m : m \geq n\}$. Clearly

$$A_1 \supseteq A_2 \supseteq A_3 \supseteq \dots$$

Hence

$$\sup A_1 \geq \sup A_2 \geq \sup A_3 \geq \dots \quad\text{and}\quad \inf A_1 \leq \inf A_2 \leq \inf A_3 \leq \dots$$

(Note that if $A \subseteq B$, then $\sup A \leq \sup B$, and $\inf A \geq \inf B$.)

Since $y_n = \sup A_n$ and $z_n = \inf A_n$, and because $\sup A_n \geq \inf A_n$, we see that

$$z_1 \leq z_2 \leq z_3 \leq \dots \leq z_n \leq \dots \leq y_n \leq \dots \leq y_3 \leq y_2 \leq y_1$$

Now any bounded monotone sequence converges (Theorem 2.3.11), and thus $\lim_n y_n$ and
$\lim_n z_n$ exist if $(x_n)$ is bounded. We now define $\limsup_n x_n = \lim_n y_n$, and $\liminf_n x_n =
\lim_n z_n$:



Definition 2.9.2 Let $(x_n)$ be a sequence in $\mathbb{R}$. We define the limit superior of $(x_n)$ by

$$\limsup_n x_n = \lim_{n\to\infty} \sup_{m \geq n} x_m$$

where we adopt the convention that if $(x_n)$ is not bounded above, we set $\limsup_n x_n = +\infty$.
Similarly, we define the limit inferior of $(x_n)$ by

$$\liminf_n x_n = \lim_{n\to\infty} \inf_{m \geq n} x_m$$

where we adopt the convention that if $(x_n)$ is not bounded below, we set $\liminf_n x_n = -\infty$.



The notions of lim sup and lim inf are quite difficult, so we will approach them in another
way. Let $(x_n)_n$ be a sequence, and let $y_n := \sup_{m \geq n} x_m$ and $z_n := \inf_{m \geq n} x_m$.



• If $\limsup_n x_n > a$, then $\lim_n y_n > a$. Hence $y_n > a$ for all $n$ (since $(y_n)_n$ is decreasing).
It follows that, for all $n$, $\sup_{m \geq n} x_m > a$, i.e. that there exists $m \geq n$ such that $x_m > a$.
Thus

$$\limsup_n x_n > a \implies \forall n \in \mathbb{N}\ \exists m \geq n\ (x_m > a)$$

i.e.

$$\limsup_n x_n > a \implies x_n > a \text{ infinitely often}$$

On the other hand, if $x_n \geq a$ infinitely often, then $y_n := \sup_{m \geq n} x_m \geq a$ for all $n$. Thus
$\limsup_n x_n = \lim_n y_n \geq a$, i.e.

$$x_n \geq a \text{ infinitely often} \implies \limsup_n x_n \geq a$$

• From the logical equivalence of $\varphi \to \psi$ and $\neg\psi \to \neg\varphi$ we see that

$$x_n \leq a \text{ eventually} \implies \limsup_n x_n \leq a$$

and

$$\limsup_n x_n < a \implies x_n < a \text{ eventually}$$

• Since $\liminf_n x_n = \sup_n \inf_{m \geq n} x_m = -\inf_n \sup_{m \geq n}(-x_m) = -\limsup_n(-x_n)$ (because
$-\sup A = \inf(-A)$, where $-A := \{-a : a \in A\}$), we need to prove a result only
for lim sup, in order to get immediately a corresponding result for lim inf. Similar
statements therefore hold for lim inf.

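Summarizing in a box:

$$\begin{array}{|lcl|lcl|}
\hline
\limsup_n x_n > a &\Rightarrow& x_n > a \text{ infinitely often} & x_n \geq a \text{ infinitely often} &\Rightarrow& \limsup_n x_n \geq a\\
x_n \leq a \text{ eventually} &\Rightarrow& \limsup_n x_n \leq a & \limsup_n x_n < a &\Rightarrow& x_n < a \text{ eventually}\\
\liminf_n x_n < a &\Rightarrow& x_n < a \text{ infinitely often} & x_n \leq a \text{ infinitely often} &\Rightarrow& \liminf_n x_n \leq a\\
x_n \geq a \text{ eventually} &\Rightarrow& \liminf_n x_n \geq a & \liminf_n x_n > a &\Rightarrow& x_n > a \text{ eventually}\\
\hline
\end{array}$$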



If you understand the implications in the box, you understand lim sup and lim inf. 
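A short worked example may help to see why strict and non-strict inequalities must be kept apart in these implications. The sequence below is our own choice of illustration, not one taken from the notes.

```latex
% Worked example (illustrative choice of sequence):
\[
  x_n = (-1)^n\Bigl(1 + \frac{1}{n}\Bigr), \qquad n \geq 1 .
\]
% For even n we have x_n = 1 + 1/n > 1, decreasing to 1;
% for odd n we have x_n = -(1 + 1/n) < -1, increasing to -1.
\[
  y_n = \sup_{m \geq n} x_m = 1 + \frac{1}{n'} \quad (n' \text{ the least even integer} \geq n),
  \qquad
  z_n = \inf_{m \geq n} x_m = -\Bigl(1 + \frac{1}{n''}\Bigr) \quad (n'' \text{ the least odd integer} \geq n),
\]
\[
  \limsup_n x_n = \lim_n y_n = 1, \qquad \liminf_n x_n = \lim_n z_n = -1 .
\]
% Here x_n > 1 infinitely often, and yet limsup x_n = 1 is NOT > 1.
% So "x_n > a infinitely often" only yields the weak conclusion limsup x_n >= a.
```

In particular, the converse of the first implication in the box fails: from $x_n > a$ infinitely often one can only conclude $\limsup_n x_n \geq a$.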



Proposition 2.9.3 Suppose that $(x_n)_n$, $(y_n)_n$ are bounded sequences in $\mathbb{R}$, and that $\lambda \in \mathbb{R}$.

(a) $\liminf_n x_n \leq \limsup_n x_n$

(b) If $\lambda \geq 0$, then $\limsup_n \lambda x_n = \lambda \limsup_n x_n$, and $\liminf_n \lambda x_n = \lambda \liminf_n x_n$

(c) If $\lambda \leq 0$, then $\limsup_n \lambda x_n = \lambda \liminf_n x_n$, and $\liminf_n \lambda x_n = \lambda \limsup_n x_n$

(d) $\limsup_n (x_n + y_n) \leq \limsup_n x_n + \limsup_n y_n$

(e) $\liminf_n (x_n + y_n) \geq \liminf_n x_n + \liminf_n y_n$

(f) If $x_n \leq y_n$, then $\limsup_n x_n \leq \limsup_n y_n$ and $\liminf_n x_n \leq \liminf_n y_n$






Proof: Here's a proof of (a): Suppose that $x := \liminf_n x_n$, and that $\varepsilon > 0$. Then
$\liminf_n x_n > x - \varepsilon$, so $x_n > x - \varepsilon$ eventually. Obviously then also $x_n > x - \varepsilon$ infinitely
often, so that $\limsup_n x_n \geq x - \varepsilon$. Since this holds for all $\varepsilon > 0$, we must
have $\limsup_n x_n \geq x$, as required.

The rest of this proposition is left as an exercise.

⊣



Exercise 2.9.4 Prove the remainder of Proposition 2.9.3 

[Hints: (c) If $x_n > z$ infinitely often and $\lambda < 0$, then $\lambda x_n < \lambda z$ infinitely often.

(d) If $z > \limsup_n x_n$ and $w > \limsup_n y_n$, then $x_n < z$ eventually and $y_n < w$ eventually.

Hence $x_n + y_n < z + w$ eventually.]



□ 



Though a bounded sequence $(x_n)_n$ may not have a limit, it always has a lim sup and a lim inf.
When $(x_n)_n$ does converge, the three notions coincide, and conversely, as we shall see next.
Note that always $\limsup_n x_n \geq \liminf_n x_n$:

Proposition 2.9.5 Suppose that $(x_n)_n$ is a bounded sequence of real numbers. Then $(x_n)_n$
converges if and only if $\limsup_n x_n = \liminf_n x_n$. In that case, $\lim_n x_n = \limsup_n x_n =
\liminf_n x_n$.

Proof: ($\Rightarrow$): Suppose that $x_n \to x$, and let $\varepsilon > 0$. Then $|x_n - x| < \varepsilon$ eventually, and thus in
particular $x_n < x + \varepsilon$ eventually. Thus $\limsup_n x_n \leq x + \varepsilon$.

Similarly $x - \varepsilon < x_n$ eventually, and thus $\liminf_n x_n \geq x - \varepsilon$.

It follows that for all $\varepsilon > 0$, we have

$$x - \varepsilon \leq \liminf_n x_n \leq \limsup_n x_n \leq x + \varepsilon$$

Since $\varepsilon$ was arbitrary, we must have $\liminf_n x_n = \limsup_n x_n = x$.

($\Leftarrow$): Suppose that $\liminf_n x_n = \limsup_n x_n = x$, and let $\varepsilon > 0$ be arbitrary. Then
$\liminf_n x_n > x - \varepsilon$, so $x_n > x - \varepsilon$ eventually. Similarly, $\limsup_n x_n < x + \varepsilon$, so $x_n < x + \varepsilon$
eventually. Combining, we see that $x - \varepsilon < x_n < x + \varepsilon$ eventually, i.e. that $|x_n - x| < \varepsilon$
eventually.

⊣

Proposition 2.9.6 Every sequence of real numbers has a monotone subsequence.
In fact every sequence $(x_n)_n$ has a monotone subsequence which converges to $\limsup_n x_n$ (and
similarly, one which converges to $\liminf_n x_n$).

Proof: Given a sequence $(x_n)_n$, we show that there is a monotone subsequence which
converges to $\limsup_n x_n$. Let $x = \limsup_n x_n$, and put $y_n = \sup_{m \geq n} x_m$ (so that $y_n \downarrow x$). We
distinguish two cases.

Case 1: $x < y_n$ for all $n$.

(We allow here the case $x = -\infty$.) In that case, we can choose a decreasing subsequence
$(x_{n_k})_k$ inductively, as follows: Let $N_1 = 1$. Since $y_{N_1} > x$, there is $n_1 \geq N_1$ so that $x_{n_1} > x$.
Next, since $y_n \downarrow x$, there is $N_2 > n_1$ such that $y_{N_2} < x_{n_1}$. Since, by hypothesis, $y_{N_2} > x$, there
is $n_2 \geq N_2$ such that $x_{n_2} > x$ also. Thus $x < x_{n_2} \leq y_{N_2} < x_{n_1} \leq y_{N_1}$.

Keep going in the same way: Once we have constructed a double sequence of integers
$N_1 \leq n_1 < N_2 \leq n_2 < \dots < N_k \leq n_k$ such that

$$x_{n_1} > x_{n_2} > \dots > x_{n_{k-1}} > x_{n_k} > x$$

we may choose $N_{k+1} > n_k$ so that $y_{N_{k+1}} < x_{n_k}$. Since also $y_{N_{k+1}} > x$, there is $n_{k+1} \geq N_{k+1}$
such that $x_{n_{k+1}} > x$. Thus $n_{k+1} \geq N_{k+1} > n_k$ and $x_{n_k} > y_{N_{k+1}} \geq x_{n_{k+1}} > x$.

This completes the inductive construction of the subsequence $(x_{n_k})_k$. Now $y_n \to x$, and
so also the subsequence $y_{N_k} \to x$, by Proposition 2.6.4. Since $x < x_{n_k} \leq y_{N_k}$ for all $k$, the
Sandwich Theorem ensures that $x_{n_k} \to x$ as well.

Case 2: There is $N_0$ such that $y_{N_0} = x$.

(We allow here the case $x = +\infty$.) In that case, since $y_n$ is a decreasing sequence converging
to $x$, we must have $y_n = x$ for all $n \geq N_0$ also. In particular, it follows that $x_n \leq x$ for all
$n \geq N_0$. Thus either (i) $x_n = x$ infinitely often, or (ii) $x_n < x$ eventually. If (i) holds, there
is obviously a constant (hence monotone) subsequence converging to $x$, so it remains to deal
with (ii).

Suppose therefore that $x_n < x$ for all $n \geq N_1$, and let $N = \max\{N_0, N_1\}$. Then

$$\forall n \geq N\ (y_n = x \wedge x_n < x)$$

Define $t_n = x - \frac{1}{n}$ if $x$ is finite, and put $t_n = n$ if $x$ is infinite. Then $t_n < x$, and $t_n \uparrow x$,
whether $x$ is finite or not. Inductively construct an increasing subsequence $(x_{n_k})_k$ as follows:
Choose $n_1 \geq N$, so that $t_1 < x_{n_1} < x$. Now $y_{n_1+1} = x$, because $y_n = x$ for all $n \geq N$, and so
there is $n_2 > n_1$ such that $\max\{x_{n_1}, t_2\} < x_{n_2}$. Of course, also $x_{n_2} < x$. Proceed in the same
way: Given $n_1 < n_2 < \dots < n_k$ such that

$$\max\{x_{n_j}, t_{j+1}\} < x_{n_{j+1}} < x \quad\text{for } j = 1, \dots, k-1$$

choose $n_{k+1} > n_k$ such that $x_{n_{k+1}} > \max\{x_{n_k}, t_{k+1}\}$; this is possible because $y_{n_k+1} =
\sup\{x_n : n > n_k\} = x$.

In this way we obtain a strictly increasing subsequence $(x_{n_k})_k$ such that $t_k < x_{n_k} < x$.
Since $t_k \to x$, we see that $x_{n_k} \to x$ also, by the Sandwich Theorem.

⊣

(Footnote to the proof of Proposition 2.9.3(a): What we are using there is that, if $a \geq x - \varepsilon$ for all $\varepsilon > 0$, then also $a \geq x$. For if not, then $a < x$. But if
we now define $\varepsilon := \frac{x - a}{2}$, then $\varepsilon > 0$ and $a < x - \varepsilon$, a contradiction.)






Chapter 3 

Basic Topology 



3.1 Introduction 

The aim of this short chapter is to create a new language for talking about space. This 
language, couched in the terminology of sets, is extremely general, and applies to structures 
vastly different from $\mathbb{R}$, although we will mainly apply it to the reals. We are going to define
a large number of simple concepts, and state a large number of simple propositions. Indeed, 
all of the propositions in this section are trivial, in that they only require one to plug in the 
appropriate definitions to prove them. But those definitions take some getting used to! It 
is therefore extremely important that you do all the exercises — perhaps several times over! 
There is no other way to learn this new language. 

Most of the concepts we define will invoke only the notion of distance $d(x, y)$ between two
points. For example, in $\mathbb{R}$, the usual distance between $x, y$ is defined by

$$d(x, y) := |x - y| = \sqrt{(x - y)^2}$$

In $\mathbb{R}^n$, the usual distance is given by $d(x, y) := \sqrt{\sum_{i=1}^n (x_i - y_i)^2}$, where $x := (x_1, x_2, \dots, x_n)$ and $y := (y_1, y_2, \dots, y_n)$.

Such a distance function d is called a metric, and the properties we require it to have are 
extremely simple: 



Definition 3.1.1 A metric space is a pair $(X, d)$ consisting of a set $X$ together with a
map $d : X \times X \to \mathbb{R}$, called a metric, which satisfies the following conditions:

(i) $d(x, y) \geq 0$ for all $x, y \in X$;

(ii) $d(x, y) = 0$ if and only if $x = y$;

(iii) $d(x, y) = d(y, x)$ for all $x, y \in X$;

(iv) $d(x, z) \leq d(x, y) + d(y, z)$ for all $x, y, z \in X$ (Triangle Inequality).
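Conditions (i)-(iv) can be tested mechanically on a finite sample of points. The sketch below is only an illustration: passing the check on a sample does not prove that a map is a metric, but a single failure shows that it is not. The function name `first_violated_axiom` and the two sample maps are hypothetical; integer sample points are used so that the arithmetic is exact.

```python
from itertools import product

def first_violated_axiom(d, points):
    """Return the label of the first condition of the metric definition that
    fails on the given finite sample, or None if none fails on the sample."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:
            return "(i)"
        if (d(x, y) == 0) != (x == y):
            return "(ii)"
        if d(x, y) != d(y, x):
            return "(iii)"
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z):
            return "(iv)"
    return None

def usual(x, y):
    """The usual distance on the real line."""
    return abs(x - y)

def squared(x, y):
    """Squared difference: symmetric and non-negative, but not a metric."""
    return (x - y) ** 2

sample = [-2, -1, 0, 1, 3]

if __name__ == "__main__":
    print(first_violated_axiom(usual, sample))
    print(first_violated_axiom(squared, sample))
```

The squared difference fails the triangle inequality already for the points $-2, -1, 0$, since $4 > 1 + 1$.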



Exercise 3.1.2 Verify that the map $d : \mathbb{R} \times \mathbb{R} \to \mathbb{R} : (x, y) \mapsto |x - y|$ is a metric, i.e. that
it satisfies (i)-(iv) in Defn. 3.1.1. Note that (iv) is equivalent to the usual triangle inequality
in $\mathbb{R}$. The triangle inequality has already been used many times in these notes to provide
important estimates. It plays a similar role in the theory of metric spaces.

□ 






Remarks 3.1.3 Many of the notions that we have studied so far carry over immediately to 
arbitrary metric spaces. For example: 

• Let $(x_n)$ be a sequence in a metric space $(X, d)$. Then $x_n \to x$ in the space $X$ should
mean that the distance between $x_n$ and $x$ converges to 0, i.e. that $d(x_n, x) \to 0$. Since
$(d(x_n, x))_n$ is a sequence of (non-negative) real numbers, we have already defined what
it means to say $d(x_n, x) \to 0$, and thus we see that

$$x_n \to x \text{ in } (X, d) \iff \forall \varepsilon > 0\ \exists N \in \mathbb{N}\ \forall n \geq N\ [d(x_n, x) < \varepsilon]$$

This is exactly the definition of convergence in $\mathbb{R}$ when $d$ is the usual metric $d(x, y) :=
|x - y|$.

• As it stands, the Completeness Axiom does not make sense in arbitrary metric spaces,
as it depends on the order relation on $\mathbb{R}$. But we saw that one of the most important
consequences of the Completeness Axiom in $\mathbb{R}$ is that Cauchy sequences converge. Now
we can define the notion of Cauchy sequence in an abstract metric space $(X, d)$:

$$(x_n) \text{ is a Cauchy sequence} \iff \forall \varepsilon > 0\ \exists N \in \mathbb{N}\ \forall n, m \geq N\ [d(x_n, x_m) < \varepsilon]$$

We then define a metric space to be complete if and only if every Cauchy sequence
converges. We thus have a definition of completeness that is phrased in terms of distance,
rather than order.

□ 

We will not study general metric spaces in this course, but restrict our attention to $\mathbb{R}$,
with its usual metric. We will, however, phrase most of our definitions in terms of the metric, 
to emphasize the geometric flavour of our definitions. As a bonus, these definitions, as well 
as many proofs, carry over verbatim to more general spaces. 

Remarks 3.1.4 Furthermore, we shall have occasion to use it in a more abstract setting at 
least once, when we discuss uniform convergence of functions. Here the underlying space is 
not $\mathbb{R}$, but the set of real-valued continuous functions $C[a, b]$ defined on an interval $[a, b]$. The
function space $C[a, b]$ comes equipped with a metric:

$$d(f, g) := \sup_{x \in [a,b]} |f(x) - g(x)|$$

We say that a sequence of such functions $(f_n)$ converges uniformly to a function $f$ if $d(f_n, f) \to 0$.
This notion is extremely important for proving regularity properties of power series
expansions of functions, for example. Unraveling the definition, this means

$$\forall \varepsilon > 0\ \exists N \in \mathbb{N}\ \forall n \geq N\ \forall x \in [a, b]\ [\,|f_n(x) - f(x)| < \varepsilon\,]$$

This would be the definition of uniform convergence if we did not use the notion of metric, and
is rather more complicated.






□ 
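The sup metric on $C[a, b]$ can be explored numerically. The following is a rough sketch under a clearly stated assumption: the supremum over $[a, b]$ is replaced by a maximum over an evenly spaced grid, which in general only gives a lower bound for $d(f, g)$. For the sample functions $f_n(x) = x^n$ on $[0, \frac12]$ (our choice) the maximum of $|f_n - 0|$ is attained at the endpoint $\frac12$, which lies on the grid. The names `sup_distance`, `power` and `zero` are hypothetical.

```python
def sup_distance(f, g, a, b, grid_points=1001):
    """Grid approximation of d(f, g) = sup over [a, b] of |f(x) - g(x)|.

    Assumption: the sup is replaced by a max over evenly spaced grid points,
    which in general is only a lower bound for the true distance.
    """
    xs = [a + (b - a) * i / (grid_points - 1) for i in range(grid_points)]
    return max(abs(f(x) - g(x)) for x in xs)

def power(n):
    """The function f_n(x) = x^n."""
    return lambda x: x ** n

def zero(x):
    """The constant function 0."""
    return 0.0

# d(f_n, 0) on [0, 1/2] equals (1/2)^n, which tends to 0: uniform convergence.
distances = [sup_distance(power(n), zero, 0.0, 0.5) for n in range(1, 8)]

if __name__ == "__main__":
    for n, dist_n in enumerate(distances, start=1):
        print(n, dist_n)
```

On $[0, 1]$ the same computation would give $d(f_n, 0) = 1$ for every $n$, so the sequence does not converge uniformly to 0 there, even though $x^n \to 0$ for each fixed $x < 1$.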






3.2 Open and Closed Sets — Motivation 

We already know what we mean by open interval and closed interval. We would like now to
extend the definition of open and closed to more general subsets of $\mathbb{R}$, and not just intervals.
Intuitively, a subset $U \subseteq \mathbb{R}$ is open if it contains none of its boundary points, and closed if it
contains all of its boundary points. For example, the interval $(a, b)$ has boundary points $a, b$,
and these do not belong to it, so $(a, b)$ is open. On the other hand, the interval $[a, b]$ has the
same boundary points, and they belong to it, so $[a, b]$ is closed. An interval such as $(a, b]$ is
neither an open set nor a closed set.

In the same way the set $(1, 2) \cup (3, \infty)$ is open, whereas the set $(0, 1)^c$ is closed.

This is all very well for simple sets such as intervals, where we can see the boundary points.
But it won't wash for more complicated sets, because we have no definition of boundary point.
What, for example, are the boundary points of the set $\mathbb{Q}$, or of the singleton set $\{0\}$? Clearly,
a deeper analysis of these notions is needed.

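For this, it helps to consider the space $(X, d)$ to be $\mathbb{R}^2$ with its usual metric, because then
we can visualize the concepts more easily by drawing pictures. Suppose, therefore, that $A$ is
a non-empty subset of $X$.

[Figure: a set $A$ and its complement $A^c$, with a point $x$ inside $A$, a point $y$ in $A^c$, and a point $z$ on the boundary of $A$.]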

We see here three points $x, y, z$. Using some standard notions in English (which have not
yet been defined mathematically), we see that:

• $x$ is clearly "inside" $A$. We say that $x$ is in the interior of $A$.

• $y$ is on the "outside" of $A$. This means that $y$ is on the "inside" of $A^c$, i.e. $y$ is in the
interior of $A^c$.

• $z$ is on the boundary of $A$. It is also on the boundary of $A^c$. It is neither in the interior
of $A$ nor in the interior of $A^c$.

It is thus clear that if we can mathematically define what it means for a point to be
"inside" a set $A$, then we can also define what it means for a point to be on the "outside"
and on the boundary. We thus seek a mathematical definition for the notion of an interior
point of a set.

Now if a point $x$ is truly "inside" a set $A$, then it should be possible to move a (perhaps
very small) distance in any direction from $x$ without leaving the set $A$. This means that there
is some (perhaps very small) number $\varepsilon > 0$ such that if you move a distance of $< \varepsilon$ in any
direction from $x$, then you stay inside $A$. Dispensing with the notion of direction, this means
that any point which is sufficiently close to the point $x$ also belongs to the set $A$. Let us now
define the open ball of radius $\varepsilon$ centered at $x$ by:

$$B(x, \varepsilon) := \{x' \in X : d(x, x') < \varepsilon\}$$






If by "sufficiently close" we mean "within a distance of $\varepsilon$", this translates to the following:
$x$ is in the interior of $A$ if and only if there is $\varepsilon > 0$ such that $B(x, \varepsilon) \subseteq A$.

Remarks 3.2.1 • Obviously, when we are dealing with $\mathbb{R}$ equipped with its usual metric,
an open "ball" is just an open interval:

$$B(x, \varepsilon) = (x - \varepsilon, x + \varepsilon)$$

Conversely an open interval is an open ball: $(a, b) = B\left(\frac{a+b}{2}, \frac{b-a}{2}\right)$.

• In $\mathbb{R}^2$ an open ball is a disc (the inside of a circle, not including its boundary), and in $\mathbb{R}^3$ it really is a ball
(not including its boundary).

• However, the notion of open ball makes sense in any metric space, because its definition
uses only the notion of metric, i.e. distance: $B(x, \varepsilon)$ is the set of all points whose
distance to the point $x$ is $< \varepsilon$. Hence the definition of interior point makes sense in any
metric space, as it only uses the notion of open ball, and this only needs the notion of
distance.

□ 
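For an open interval in $\mathbb{R}$ the radius required by the definition of interior point can be written down explicitly. The sketch below is an illustration only; the helper names `interior_radius` and `ball_inside_interval` are hypothetical. It uses the fact that $B(x, \varepsilon) = (x - \varepsilon, x + \varepsilon)$ is contained in $(a, b)$ exactly when $a \leq x - \varepsilon$ and $x + \varepsilon \leq b$.

```python
from fractions import Fraction

def interior_radius(x, a, b):
    """Largest eps with B(x, eps) = (x - eps, x + eps) contained in (a, b).

    Returns None when x does not lie in the open interval (a, b).
    """
    if not (a < x < b):
        return None
    return min(x - a, b - x)

def ball_inside_interval(x, eps, a, b):
    """Is the open interval (x - eps, x + eps) a subset of (a, b)?"""
    return a <= x - eps and x + eps <= b

if __name__ == "__main__":
    a, b = Fraction(0), Fraction(3)
    for x in (Fraction(1), Fraction(1, 1000), Fraction(0)):
        eps = interior_radius(x, a, b)
        print(x, eps, None if eps is None else ball_inside_interval(x, eps, a, b))
```

However close a point of $(a, b)$ lies to an endpoint, the returned radius is strictly positive, while the endpoints themselves admit no such radius.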

Now that we have mathematically defined the notion of interior point, we can define the
notion of boundary point. A point $z \in X$ is a boundary point of $A$ if it is neither an interior
point of $A$ nor an interior point of $A^c$. So let's first analyze what it means to assert that $z$ is
not an interior point of a set $A$: It means that for no $\varepsilon > 0$ is it the case that $B(z, \varepsilon) \subseteq A$.
This, in turn, means that for every $\varepsilon > 0$ we must have $B(z, \varepsilon) \cap A^c \neq \emptyset$, i.e. every open ball
centered at $z$ must intersect $A^c$, i.e. must contain a point of $A^c$. [It may be helpful to recall
that $B \subseteq C$ if and only if $B \cap C^c = \emptyset$.]

Since a point $z \in X$ is a boundary point of $A$ if it is neither an interior point of $A$ nor an
interior point of $A^c$, we can now assert that

$z$ is a boundary point of $A$ if and only if every open ball centered at $z$ intersects both $A$ and $A^c$.

Note that, since $(A^c)^c = A$, this definition is symmetric in $A, A^c$. Thus if $z$ is a boundary
point of $A$, then $z$ is also a boundary point of $A^c$, and vice versa.

Now recall that we want a set $A \subseteq X$ to be open if $A$ contains none of its boundary points.
We want a set $A \subseteq X$ to be closed if $A$ contains all of its boundary points. If we analyze this,
it turns out that we can dispense with the notion of boundary point altogether!

• $A$ is open iff for every $x \in A$ it is the case that $x$ is not a boundary point of $A$.

- This means that for every $x \in A$ there is some open ball $B(x, \varepsilon)$, with $\varepsilon > 0$, such that
either $B(x, \varepsilon) \cap A^c = \emptyset$ or $B(x, \varepsilon) \cap A = \emptyset$.

- But it is impossible that $B(x, \varepsilon) \cap A = \emptyset$, since $x \in A$ and $x \in B(x, \varepsilon)$.

- We thus see that for every $x \in A$ there is $\varepsilon > 0$ such that $B(x, \varepsilon) \cap A^c = \emptyset$.

- Hence for every $x \in A$ there is $\varepsilon > 0$ such that $B(x, \varepsilon) \subseteq A$. [Again recall that
$B \subseteq C$ if and only if $B \cap C^c = \emptyset$.]

- But this means exactly that every $x \in A$ is an interior point of $A$ (!!)

Thus $A \subseteq X$ is open if and only if every point of $A$ is an interior point of $A$.

• $A \subseteq X$ is closed if every boundary point of $A$ belongs to $A$.

- This means that every boundary point of $A^c$ belongs to $A$, because $A$ and $A^c$ have
the same boundary points.

- Hence no boundary point of $A^c$ belongs to $A^c$.

- And hence $A^c$ is open (!!).

Thus $A \subseteq X$ is closed if and only if $A^c$ is open.



We built on the intuition provided by open and closed intervals to define open and closed
sets in any metric space. For this, we needed the notion of boundary point. But when we
finally analyzed the definitions of open and closed set, it turned out that boundary points
were unnecessary: all we need is the notion of interior point. A set is open precisely if all
of its points are interior points. A set is closed if its complement is an open set.

$$A \subseteq X \text{ is open} \iff \forall x \in A\ \exists \varepsilon > 0\ [B(x, \varepsilon) \subseteq A]$$



□ 



In the next section we will write down all these definitions again, without the motivation 
provided here, and proceed to deduce some simple consequences from these definitions. 






3.3 Open and Closed Sets — Definitions and Basic Properties 

3.3.1 Definitions 
Definition 3.3.1 Let $(X, d)$ be a metric space.

(i) For $x_0 \in X$ and $r > 0$, we define the open ball of radius $r$ centered at $x_0$ by

$$B(x_0, r) := \{x \in X : d(x_0, x) < r\}$$

(ii) Let $A \subseteq X$. We say that a point $x \in X$ is an interior point of $A$ if and only if there
is an open ball centered at $x$ which is contained in $A$. We denote the set of interior
points of $A$ by $A^\circ$, i.e.

$$x \in A^\circ \iff \exists r > 0\ [B(x, r) \subseteq A]$$

Note that $A^\circ \subseteq A$.

(iii) We say that $A \subseteq X$ is a neighbourhood of $x \in X$ if and only if $x$ is an interior point
of $A$:

$$A \text{ is a neighbourhood of } x \iff x \in A^\circ$$

(iv) We say that $A \subseteq X$ is an open set if and only if every element of $A$ is an interior
point, if and only if $A$ is a neighbourhood of each of its elements:

$$A \text{ is open} \iff A^\circ = A$$

(v) We say that $A \subseteq X$ is a closed set if and only if its complement $A^c$ is open.

(vi) We say that $x \in X$ is a boundary point of $A \subseteq X$ if and only if every open ball
centered at $x$ has non-empty intersection with both $A$ and $A^c$, if and only if $x$ is
neither an interior point of $A^c$, nor an interior point of $A$. We denote the set of
boundary points of $A$ by $\partial A$:

$$x \in \partial A \iff \forall r > 0\ [B(x, r) \cap A \neq \emptyset \wedge B(x, r) \cap A^c \neq \emptyset]$$

Exercise 3.3.2 Let us see what these notions mean in $\mathbb{R}$, i.e. let $(X, d)$ be $\mathbb{R}$ with the usual
metric $d(x, y) := |x - y|$:

(a) Show that every open ball is an open interval, and vice versa.

(b) Show that every open interval is an open set.

(c) Find an open set which is not an open interval.

(d) Show that every closed interval is a closed set.

(e) Show that $[0, 1]^\circ = (0, 1)$. Deduce that $[0, 1]$ is not an open set.

(f) Show that $(0, 1]$ is neither open, nor closed.

(g) Show that $\mathbb{Q}^\circ = \emptyset$.

(h) Show that $\emptyset$ and $\mathbb{R}$ are both open sets, and also that they are both closed sets.

□ 

3.3.2 Open Sets 

We have $A^\circ \subseteq A$, and thus $\emptyset^\circ = \emptyset$. It follows that $\emptyset$ is always open, in any metric space.



Proposition 3.3.3 Let $(X, d)$ be a metric space.

A set is open if and only if it is a (possibly infinite) union of open balls.

Proof: ($\Rightarrow$): Suppose that $A \subseteq X$ is an open set. If $a \in A$, then $a$ is an interior point of $A$,
and thus there is $r_a > 0$ so that $B(a, r_a) \subseteq A$. It follows easily that $A = \bigcup_{a \in A} B(a, r_a)$, and
thus that $A$ is a union of open balls.

($\Leftarrow$): Suppose that $A = \bigcup_{i \in I} B_i$ is a union of open balls $B_i$, where $B_i := B(x_i, r_i)$ is the open
ball with center $x_i$ and radius $r_i > 0$. We want to show that $A$ is an open set, i.e. that each
point of $A$ is an interior point of $A$. So let $a \in A$ be an arbitrary point. Then there is some
$j \in I$ so that $a \in B_j$, and thus $d(a, x_j) < r_j$. Choose $\varepsilon > 0$ sufficiently small so that $d(a, x_j) + \varepsilon$
is still $\leq r_j$. (E.g. take $\varepsilon := \frac12 (r_j - d(a, x_j))$. Then $d(a, x_j) + \varepsilon = \frac12 (d(a, x_j) + r_j) < \frac12 (r_j + r_j) = r_j$,
but draw a picture!) We now claim that $B(a, \varepsilon) \subseteq B(x_j, r_j)$: For if $z \in B(a, \varepsilon)$, then
$d(z, a) < \varepsilon$, so

$$d(z, x_j) \leq d(z, a) + d(a, x_j) < \varepsilon + d(a, x_j) \leq r_j$$

and hence $z \in B(x_j, r_j)$. It now follows that $a$ is indeed an interior point of $A$: $B(a, \varepsilon) \subseteq
B(x_j, r_j) \subseteq \bigcup_{i \in I} B(x_i, r_i) = A$. Since $a$ was an arbitrary point of $A$, we conclude that every
point of $A$ is an interior point of $A$, and thus that $A$ is an open set.
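The choice $\varepsilon := \frac12 (r_j - d(a, x_j))$ in the proof can be tried out in the plane. The following is a small sketch, assuming the usual metric on $\mathbb{R}^2$ and a handful of sample points of $B(a, \varepsilon)$; checking finitely many points illustrates the inclusion $B(a, \varepsilon) \subseteq B(x_j, r_j)$ but of course does not prove it. All names (`dist`, `shrunken_radius`, `sample_ball`) are hypothetical.

```python
import math

def dist(p, q):
    """Usual (Euclidean) distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def shrunken_radius(a, center, r):
    """The radius eps = (r - d(a, center)) / 2 used in the proof.

    Requires a to lie in the open ball B(center, r).
    """
    gap = r - dist(a, center)
    if gap <= 0:
        raise ValueError("a must lie in the open ball B(center, r)")
    return gap / 2

def sample_ball(a, eps, steps=12):
    """A few points of B(a, eps): the center plus points at distance
    strictly less than eps in evenly spaced directions."""
    points = [a]
    for i in range(steps):
        angle = 2 * math.pi * i / steps
        for fraction in (0.25, 0.5, 0.99):
            points.append((a[0] + fraction * eps * math.cos(angle),
                           a[1] + fraction * eps * math.sin(angle)))
    return points

center, r = (0.0, 0.0), 1.0
a = (0.6, 0.7)  # d(a, center) is about 0.922, so a lies in B(center, 1)
eps = shrunken_radius(a, center, r)
inside = all(dist(z, center) < r for z in sample_ball(a, eps))

if __name__ == "__main__":
    print(eps, inside)
```

The triangle inequality is what guarantees the inclusion for every point of $B(a, \varepsilon)$, not just the sampled ones.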



An open ball is a "union" of a family containing just one open ball — namely itself. Hence 
we may conclude that open balls are open sets. The empty set is also a "union" of a family 
of open balls — namely no open balls — and hence the empty set is open. 

Restricting to $\mathbb{R}$, it follows that a set $A \subseteq \mathbb{R}$ is open if and only if $A$ is a union of open
intervals.

Proposition 3.3.4 Let $(X, d)$ be a metric space. A subset $A \subseteq X$ is open if and only if
it contains none of its boundary points, i.e.

$$A \text{ is open} \iff A \cap \partial A = \emptyset$$

Exercise 3.3.5 Prove Proposition 3.3.4.

Hint: Suppose that $A \subseteq X$, and that $x \in A$. Show that $x$ is an interior point of $A$ iff it is not
a boundary point of $A$, i.e. that

$$x \in A^\circ \iff x \notin \partial A$$

□ 



Here are some important properties of the family of open sets in a metric space: 






Proposition 3.3.6 Let $(X, d)$ be a metric space. The family of open sets satisfies the
following axioms:

(T.1) $X$ and $\emptyset$ are open.

(T.2) The union of any (possibly infinite) collection of open sets is open.

(T.3) The intersection of any finite collection of open sets is open.



Exercise 3.3.7 Prove Proposition 3.3.6 



□ 
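The word "finite" in (T.3) cannot be dropped. The following standard counterexample in $\mathbb{R}$ (with its usual metric) is added here as an illustration; it is not part of the original notes.

```latex
% Each U_n is an open interval, hence an open set:
\[
  U_n := \Bigl(-\frac{1}{n}, \frac{1}{n}\Bigr), \qquad n \in \mathbb{N}.
\]
% The point 0 lies in every U_n. If x \neq 0, choose n with 1/n \leq |x|; then x \notin U_n. Hence
\[
  \bigcap_{n \in \mathbb{N}} U_n = \{0\}.
\]
% But \{0\} is not open: for every r > 0 the ball B(0, r) = (-r, r) contains r/2 \neq 0,
% so B(0, r) \not\subseteq \{0\} and 0 is not an interior point of \{0\}.
```

For a finite intersection $U_1 \cap \dots \cap U_k$ the corresponding argument works because the minimum of finitely many positive radii is still positive, whereas here the radii $\frac{1}{n}$ have infimum 0.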

Remarks 3.3.8 The above axioms form the basis for a further level of abstraction, i.e. a 
level even more general than metric spaces: A topological space is a set X equipped with a 
family O of subsets of X — called the open sets — satisfying T.l, T.2, T.3. The subject of 
topology studies topological spaces (and the continuous maps between them) . 

□ 

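Here is a nifty topological criterion for convergence which does not mention $\varepsilon > 0$: Recall
that if $(x_n)$ is a sequence in $X$, then we define convergence in $X$ by

$$x_n \to x \iff \forall \varepsilon > 0\ \exists N \in \mathbb{N}\ \forall n \geq N\ [d(x_n, x) < \varepsilon]$$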



Proposition 3.3.9 Let $(X, d)$ be a metric space, and suppose that $(x_n)$ is a sequence in
$X$, and $x \in X$. Then $x_n \to x$ if and only if given any neighbourhood $U$ of $x$, we have
$x_n \in U$ eventually, i.e. for any neighbourhood $U$ of $x$ there is $N$ such that $x_n \in U$ for all
$n \geq N$.

Proof: Recall that $U$ is a neighbourhood of $x$ if and only if there is $\varepsilon > 0$ such that
$B(x, \varepsilon) \subseteq U$. Then $B(x, \varepsilon)$ is itself a neighbourhood of $x$.

Now suppose that $x_n \to x$, and that $U$ is a neighbourhood of $x$ with $B(x, \varepsilon) \subseteq U$. Then there
is $N$ such that $d(x_n, x) < \varepsilon$ for all $n \geq N$. Hence $x_n \in B(x, \varepsilon)$ for all $n \geq N$, and thus $x_n \in U$
for all $n \geq N$, i.e. $x_n \in U$ eventually.

Conversely, suppose that $x_n$ eventually belongs to any neighbourhood of $x$. Let $\varepsilon > 0$. Then
$x_n \in B(x, \varepsilon)$ eventually, so there is $N$ such that $x_n \in B(x, \varepsilon)$ for any $n \geq N$. This means
that for any $\varepsilon > 0$ there is $N$ such that $d(x_n, x) < \varepsilon$ for all $n \geq N$, i.e. that $d(x_n, x) \to 0$.



3.3.3 Closed Sets 

Recall that a subset $A$ of a metric space is defined to be closed if and only if its complement
$A^c$ is an open set. Using de Morgan's laws and Proposition 3.3.6, it is easy to prove the
analogues of (T.1)-(T.3) for closed sets:



Proposition 3.3.10 Let X be a metric space. 

1. $X$ and $\emptyset$ are closed.

2. The intersection of a (possibly infinite) collection of closed sets is closed. 

3. The union of finitely many closed sets is closed. 






Exercise 3.3.11 Prove Proposition 3.3.10 by combining Proposition 3.3.6 with De Morgan's 
Laws. 

□ 

We now show that closed means closed under limits: 



Proposition 3.3.12 Suppose that $(X, d)$ is a metric space, and that $C \subseteq X$. Then the
following are equivalent:

(i) $C$ is closed.

(ii) Whenever $(c_n)_n$ is a sequence in $C$ which converges, then it converges to a point in
$C$, i.e.

$$c_n \in C \text{ and } c_n \to x \text{ implies } x \in C$$

Proof: (i) $\Rightarrow$ (ii): Suppose $C$ is closed in $X$, and that $(c_n)_n$ is a sequence in $C$ which
converges, i.e. $c_n \to x$. We must show $x \in C$, and we argue by contradiction: If $x \notin C$, then
$x \in C^c$. Since $C$ is closed, $C^c$ is open, so there is $\varepsilon > 0$ so that $B(x, \varepsilon) \subseteq C^c$. Since $c_n \to x$,
we have $d(c_n, x) < \varepsilon$ eventually (i.e. $\exists N \in \mathbb{N}\ \forall n \geq N\ [d(c_n, x) < \varepsilon]$). Then $c_n \in B(x, \varepsilon) \subseteq C^c$
eventually, contradicting the assumption that $c_n \in C$ for all $n$.

(ii) $\Rightarrow$ (i): We prove the contrapositive, i.e. we show that $\neg$(i) $\Rightarrow$ $\neg$(ii): Suppose that $C$ is not
closed, i.e. that $C^c$ is not open. Then there is a point $x$ in $C^c$ which is not an interior point
of $C^c$. Thus for every $r > 0$ we have $B(x, r) \not\subseteq C^c$. Let $r_n > 0$ be real numbers such that
$r_n \to 0$ (e.g. define $r_n := \frac{1}{n}$). Since $B(x, r_n) \not\subseteq C^c$, we must have $B(x, r_n) \cap C \neq \emptyset$. Choose,
therefore, $c_n \in B(x, r_n) \cap C$. Then $(c_n)$ is a sequence in $C$, and $d(c_n, x) < r_n$, so $c_n \to x$. Yet
$x \notin C$.



Proposition 3.3.13 Let $(X, d)$ be a metric space. A subset $C \subseteq X$ is closed if and only
if it contains all of its boundary points, i.e.

$$C \text{ is closed} \iff \partial C \subseteq C$$



Exercise 3.3.14 Prove Proposition 3.3.13 



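[Hint: Note that $C$ and $C^c$ have the same boundary, i.e. $\partial(C^c) = \partial C$. Now apply Proposition
3.3.4 to deduce that $C$ is closed $\iff$ $C^c$ is open $\iff$ $C^c \cap \partial(C^c) = \emptyset$ $\iff$ $C^c \cap \partial C = \emptyset$ $\iff$ $\partial C \subseteq C$.]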

□ 



Definition 3.3.15 Let $(X, d)$ be a metric space, and $A \subseteq X$, $x_0 \in X$. We say that $x_0$ is
a cluster point of $A$ if and only if for any $r > 0$ there is $a \in A$ such that $0 < d(x_0, a) < r$,
i.e. iff for every $r > 0$, there is $a \in B(x_0, r) \cap A$ such that $a \neq x_0$.
If $x_0 \in A$, but is not a cluster point of $A$, it is said to be an isolated point of $A$.

Note that a cluster point of a set $A$ need not be an element of $A$. Note also that imposing
the condition $0 < d(x_0, a) < r$ is equivalent to saying that $a, x_0$ are "close" (within $r$ of each
other), but not equal.






Examples 3.3.16 1. $0$ is the only cluster point of the set $\{\frac{1}{n} : n \in \mathbb{N}\}$. Each element of the
set is isolated.

2. $x$ is a cluster point of the interval $(a, b)$ if and only if $x \in [a, b]$. Thus $(a, b)$ has no isolated
points.

3. Each $x \in \mathbb{R}$ is a cluster point of the set $\mathbb{Q}$.

4. The set $\mathbb{Z}$ has no cluster points in $\mathbb{R}$.

5. A finite subset of $\mathbb{R}$ has no cluster points, i.e. each element is isolated.

6. If $x_n \to x$ and $x_n \neq x$ infinitely often, then $x$ is a cluster point of the set $\{x_n : n \in \mathbb{N}\}$.

□ 

Exercise 3.3.17 (a) Find all the cluster points of the following subsets of $\mathbb{R}$ (equipped with
the usual metric):

$$A := (0, 1) \qquad B := [0, 1] \qquad C := (0, 1) \cup \{2\} \qquad D := \{\tfrac{1}{n} : n \in \mathbb{N}\} \qquad E := \mathbb{Q}$$

(b) Find the boundaries $\partial A, \dots, \partial E$ of the sets $A, \dots, E$ above.

(c) Find the closures $\bar{A}, \dots, \bar{E}$ of the sets $A, \dots, E$ above.
□ 



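Proposition 3.3.18 Let $(X, d)$ be a metric space, and let $A \subseteq X$. A point $x \in X$ is a
cluster point of $A$ if and only if there is a sequence $(a_n)$ in $A$ with distinct terms (i.e. $n \neq m$
implies $a_n \neq a_m$) such that $a_n \to x$.

Proof: Suppose that $x$ is a cluster point of $A$. Then there is $a_1 \in A \cap B(x, 1)$, such that $a_1 \neq x$.
Let $r_1 := \min\{d(x, a_1), 1\} > 0$. Then there is $a_2 \in A \cap B(x, r_1)$ such that $a_2 \neq x$. Then
$d(x, a_2) < r_1 = d(x, a_1)$, so $a_2 \neq a_1$. Next, let $r_2 := \min\{d(x, a_2), \frac12\} > 0$. Then there is
$a_3 \in A \cap B(x, r_2)$ such that $a_3 \neq x$. Then $d(x, a_3) < r_2 < r_1$, so $a_3 \neq a_2, a_1$.

Proceed inductively: Suppose we have found $a_1, \dots, a_n$ with $d(x, a_j) \geq r_j$, where $1 \geq r_1 >
r_2 > \dots > r_{n-1} > 0$. Let $r_n := \min\{d(x, a_n), \frac{1}{n}\}$. Then there is $a_{n+1} \in A \cap B(x, r_n)$ such that
$a_{n+1} \neq x$. Thus $0 < d(x, a_{n+1}) < r_n < r_{n-1} < \dots < r_1 \leq 1$, so $a_{n+1} \notin \{a_1, a_2, \dots, a_n\}$.

In this way we obtain a sequence $(a_n)$ of distinct elements of $A$. Moreover, $d(x, a_n) <
r_{n-1} \leq \frac{1}{n-1}$ for each $n \geq 2$, and hence $d(x, a_n) \to 0$, which means $a_n \to x$.

Conversely, suppose that $(a_n)$ is a sequence of distinct elements of $A$ with $a_n \to x$. At most one term
equals $x$, so for every $r > 0$ the ball $B(x, r)$ contains some $a_n \neq x$. Hence $x$ is a cluster point of $A$.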
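The inductive construction in this proof can be carried out explicitly for the set $A = \{\frac{1}{n} : n \in \mathbb{N}\}$ and its cluster point $x = 0$. The sketch below is illustrative only. The proof merely asserts that a suitable point of $A \cap B(x, r)$ exists; here we assume one particular selection rule (take the largest element of $A$ below $r$), implemented by the hypothetical helper `pick_point`.

```python
from fractions import Fraction
import math

def pick_point(r):
    """Largest element 1/n of A = {1/n : n in N} with 0 < 1/n < r.

    Selection rule assumed for illustration: 1/n < r iff n > 1/r,
    so the least admissible n is floor(1/r) + 1.
    """
    n = math.floor(1 / r) + 1
    return Fraction(1, n)

def distinct_sequence(length):
    """First `length` terms a_1, a_2, ... built as in the proof, for x = 0."""
    x = Fraction(0)
    terms = []
    r = Fraction(1)  # the first ball is B(x, 1)
    for k in range(1, length + 1):
        a = pick_point(r)  # a lies in A and in B(x, r), and a != x
        terms.append(a)
        r = min(abs(x - a), Fraction(1, k))  # r_k := min{d(x, a_k), 1/k}
    return terms

if __name__ == "__main__":
    print([str(a) for a in distinct_sequence(6)])
```

Because each new radius is at most the distance from $x$ to the previous term, every newly chosen point is strictly closer to $x$ than all earlier ones, which is exactly what forces the terms to be distinct.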



Proposition 3.3.19 Let X be a metric space. A subset of X is closed if and only if it 
contains all its cluster points. 



Exercise 3.3.20 We prove Proposition 3.3.19. Let $(X, d)$ be a metric space. Recall that the
following are equivalent for a set $C \subseteq X$.

(i) $C$ is closed.

(ii) $C$ contains all its boundary points.

(iii) Whenever $(c_n)$ is a sequence in $C$ which is convergent, then $\lim_n c_n \in C$.

(a) Suppose first that $C \subseteq X$ is closed, and that $x$ is a cluster point of $C$. Explain why there
is a sequence $(c_n)$ in $C$ such that $c_n \to x$. Conclude that $C$ contains all its cluster points.

(b) Show that if $A \subseteq X$, and if $x \in X$ is a point such that $x \in \partial A$ but $x \notin A$, then $x$ is a
cluster point of $A$.

(c) Conclude that if a set contains all its cluster points, then it contains all its boundary
points, and hence is closed.

□

Exercise 3.3.21 Let $X := C[0, 1]$ be the set of all continuous functions $[0, 1] \to \mathbb{R}$.

(a) Define $d : X \times X \to \mathbb{R}$ by

$$d(f, g) := \sup\{|f(x) - g(x)| : x \in [0, 1]\}$$

(i) Show that $d$ is a metric on $X$. [You may assume that $\sup_{0 \leq x \leq 1} |f(x) - g(x)|$ is
always finite when $f, g$ are continuous; we will prove this later.]

(ii) Hence describe the open ball $B(x^2, 1)$ of radius 1 centered on the function $y = x^2$
(restricted to $[0, 1]$).

(iii) Show that $f_n \to f$ in $(X, d)$ if and only if

$$\forall \varepsilon > 0\ \exists N\ \forall n \geq N\ \forall x \in [0, 1]\ [\,|f_n(x) - f(x)| < \varepsilon\,]$$

If $f_n \to f$ in $(X, d)$, we say that $(f_n)$ converges uniformly.

(b) Define $d : X \times X \to \mathbb{R}$ by

$$d(f, g) := \left(\int_0^1 |f(x) - g(x)|^2\,dx\right)^{1/2}$$

It can be shown that $d$ is a metric. Describe the open ball $B(0, 1)$ of radius 1 centered at
the constant function 0.

□ 

3.4 Compact Sets 

Compactness is one of the most important notions in analysis and topology. Yet it is very
difficult to explain where the definition comes from. In some ways, compactness is a
generalization of finiteness. Because the notion is so unfamiliar, we will restrict ourselves to $\mathbb{R}$, rather
than general metric spaces.






Definition 3.4.1 We say that a subset $K \subseteq \mathbb{R}$ is sequentially compact if and only if it
has the following property: Every sequence $(x_n)$ in $K$ has a subsequence $(x_{n_k})_k$ such that

(i) $(x_{n_k})_k$ is convergent, and

(ii) $\lim_k x_{n_k} \in K$.



Exercise 3.4.2 (a) Show that every finite subset of $\mathbb{R}$ is sequentially compact.

[Hint: Explain why any sequence in a finite set must have a constant subsequence.]

(b) Show that $\mathbb{R}$ is not sequentially compact.

(c) Show that $(0, 1)$ is not sequentially compact.

(d) According to the Bolzano-Weierstrass Theorem, every bounded sequence in $\mathbb{R}$ has a
convergent subsequence. Use this to show that every closed and bounded subset of $\mathbb{R}$ is
sequentially compact.

(Note that any finite subset of $\mathbb{R}$ is closed and bounded, so (d) implies (a).)

□ 



Theorem 3.4.3 Let $K \subseteq \mathbb{R}$. The following are equivalent:

(i) $K$ is sequentially compact: Every sequence in $K$ has a subsequence that converges to
a limit that is also in $K$.

(ii) $K$ is closed and bounded.

Proof: (i) $\Rightarrow$ (ii): Recall first that if a sequence converges, then so does every one of its
subsequences, and to the same limit. Furthermore, a set is closed if it is closed under limits.

Suppose now that every sequence in $K$ has a subsequence which converges to a limit in
$K$. If $(k_n)$ is a sequence in $K$ with $k_n \to x$, then every subsequence of $(k_n)$ converges to $x$
also. Since at least one subsequence has its limit in $K$, we get $x \in K$, i.e. $K$ is closed under
limits, and thus closed.

It is straightforward to show that $K$ is also bounded: If not, we could find, for every
$n \in \mathbb{N}$, a $k_n \in K$ such that $|k_n| > n$. Then every subsequence of $(k_n)$ is unbounded, so no
subsequence of $(k_n)$ converges at all.

(ii) $\Rightarrow$ (i): This is Exercise 3.4.2(d): Suppose that $K$ is closed and bounded, and let $(k_n)$
be a sequence in $K$. Since $K$ is bounded, so is $(k_n)$. By the Bolzano-Weierstrass Theorem, any
bounded sequence has a convergent subsequence, so $(k_n)$ has a convergent subsequence
$(k_{n_j})_j$. Since $K$ is closed, it is closed under limits, i.e. $\lim_j k_{n_j} \in K$.



Definition 3.4.4 (a) Let $A \subseteq \mathbb{R}$. An open cover of $A$ is a family $\{U_i : i \in I\}$ of open sets
in $\mathbb{R}$ such that $A \subseteq \bigcup_{i \in I} U_i$.

(b) We say that a subset $K \subseteq \mathbb{R}$ is compact if and only if it has the following property:
Whenever $\mathcal{U} = \{U_i : i \in I\}$ is a family of open sets such that $K \subseteq \bigcup_{i \in I} U_i$, there
exists a finite subfamily $U_{i_1}, \dots, U_{i_n} \in \mathcal{U}$ such that $K \subseteq \bigcup_{k=1}^{n} U_{i_k}$.
Succinctly put: $K$ is compact if and only if every open cover of $K$ has a finite subcover.






Examples 3.4.5 (a) Every finite subset of $\mathbb{R}$ is compact. Why?

(b) The set $\mathbb{R}$ (with the usual metric) is not compact. For example, if $U_n = B(0, n) = (-n, n)$, then
$\{U_n : n \in \mathbb{N}\}$ is an open cover of $\mathbb{R}$. Yet it clearly has no finite subcover. Why not?
The same argument shows that no unbounded subset of $\mathbb{R}$ can be compact, i.e. compact
subsets of $\mathbb{R}$ are necessarily bounded.

(c) No open interval $(a, b)$ is compact in $\mathbb{R}$: Let $U_n = (a + \frac{1}{n}, b)$. Then clearly $(a, b) =
\bigcup_n U_n$ (i.e. $\{U_n\}_n$ is an open cover of $(a, b)$). Yet $\{U_n\}_n$ clearly has no finite subcover of
$(a, b)$. Why not?

□ 
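Example (c) can be made concrete. The sketch below uses exact rational arithmetic and, for any finite selection of indices, produces a point of $(a, b)$ that the chosen sets $U_n = (a + \frac{1}{n}, b)$ miss. The helper names `in_U` and `uncovered_point` are hypothetical and only serve the illustration.

```python
from fractions import Fraction

def in_U(x, a, b, n):
    """Membership test for U_n = (a + 1/n, b)."""
    return a + Fraction(1, n) < x < b

def uncovered_point(a, b, indices):
    """Return a point of (a, b) missed by the finite subfamily {U_n : n in indices}."""
    n_max = max(indices)
    # The sets U_n increase with n, so the union of the chosen sets is U_{n_max}.
    # Any x with a < x <= a + 1/n_max is therefore left uncovered.
    x = a + Fraction(1, n_max)
    # Taking the minimum with the midpoint guarantees that x also lies in (a, b).
    return min(x, (a + b) / 2)

a, b = Fraction(0), Fraction(1)
chosen = [1, 5, 100]
x = uncovered_point(a, b, chosen)
```

Whatever finite list `chosen` is supplied, the returned point lies in $(a, b)$ but in none of the selected $U_n$, which is the reason no finite subcover exists.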

Exercise 3.4.6 We prove that the closed unit interval $[0, 1]$ is a compact subset of $\mathbb{R}$.

(a) Let $I = [0, 1]$ be the closed unit interval, and let $\mathcal{U} = \{U_\gamma : \gamma \in \Gamma\}$ be an open cover of
$I$. Define $I^*$ to be the set of all those $x \in I$ for which $[0, x]$ can be covered by a finite
subfamily of $\mathcal{U}$:

\[ I^* = \{ x \in [0, 1] : \exists \gamma_1, \dots, \gamma_m \in \Gamma \ ([0, x] \subseteq U_{\gamma_1} \cup \dots \cup U_{\gamma_m}) \} \]

(b) Explain why $0 \in I^*$.

(c) Show that $I^*$ is a subinterval of $I$: If $x \in I^*$ and $0 \le y \le x$, then $y \in I^*$.

(d) Define $x^* = \sup I^*$. Explain why $0 \le x^* \le 1$.

(e) Explain why there is $\gamma^* \in \Gamma$ such that $x^* \in U_{\gamma^*}$.

(f) Explain why $x^* \in I^*$.

(g) Assume now that $x^* < 1$. Explain why there is $\varepsilon > 0$ such that $[x^* - \varepsilon, x^* + \varepsilon] \subseteq U_{\gamma^*} \cap [0, 1]$.

(h) Explain why $[0, x^* + \varepsilon]$ can be covered by a finite subfamily of $\mathcal{U}$.

(i) Conclude that $x^* + \varepsilon \in I^*$.

(j) Explain why this is a contradiction.

(k) Deduce that $1 \in I^*$, and thus that $I$ can be covered by a finite subfamily of $\mathcal{U}$.

□ 

The above exercise can easily be generalized to show that any closed interval $[a, b]$ in $\mathbb{R}$ is
compact.

Theorem 3.4.7 Let $K \subseteq \mathbb{R}$. The following are equivalent:
(i) $K$ is sequentially compact.
(ii) $K$ is closed and bounded.
(iii) $K$ is compact.






Proof: (i) $\Leftrightarrow$ (ii) is Theorem 3.4.3.

(iii) $\Rightarrow$ (ii): Suppose now that every open cover of $K$ has a finite subcover. We first show
that $K$ is bounded. Let $r > 0$, and define $U_x := (x - r, x + r)$. Then $\{U_x : x \in K\}$ is an
open cover for $K$ (since if $y \in K$, then $y \in U_y \subseteq \bigcup_{x \in K} U_x$). By hypothesis, there is a finite
subcover, i.e. there are $x_1, \dots, x_n \in K$ such that $K \subseteq \bigcup_{i=1}^{n} U_{x_i}$. Hence $K$ is contained in a
union of finitely many open intervals, each of finite length, which clearly implies that $K$ is
bounded.

Next, we show that $K$ is closed, i.e. closed under limits. So suppose that $(y_n)$ is a
convergent sequence in $K$, and that $y_n \to y$. We must show that $y \in K$. Now if $y \notin K$,
define an open cover of $K$ as follows: For each $x \in K$, let $r_x := \frac{1}{2}|x - y|$ be half the distance
between $x$ and $y$ (so that $r_x > 0$, because $x \ne y$), and define $V_x := B(x, r_x) = (x - r_x, x + r_x)$
to be the interval of radius $r_x$ centered at $x$. Then $\{V_x : x \in K\}$ is clearly an open cover of $K$.
By assumption, there exist $x_1, \dots, x_m \in K$ such that $K \subseteq \bigcup_{j=1}^{m} V_{x_j}$. Define
$\varepsilon := \min\{r_{x_j} : j = 1, \dots, m\}$. Then $\varepsilon > 0$.
Note that if $x \in K$, then $|x - y| \ge \varepsilon$: For if $x \in K$, then there is $j$ such that $x \in V_{x_j}$, i.e. such
that $|x - x_j| < r_{x_j}$. Using the fact that $|a - b| \ge |a| - |b|$, we see that

\[ |x - y| = |(x - x_j) - (y - x_j)| \ge |y - x_j| - |x - x_j| > |y - x_j| - r_{x_j} = r_{x_j} \ge \varepsilon \]

Now since $(y_n)$ is a sequence in $K$, we have $|y_n - y| \ge \varepsilon$ for all $n$. Hence it is impossible that
$y_n \to y$, a contradiction. Hence the assumption $y \notin K$ leads to a contradiction, and we may
conclude that $y \in K$, as required.

(i) $\Rightarrow$ (iii): Suppose that $K \subseteq \mathbb{R}$ is such that every sequence in $K$ has a subsequence that
converges to a limit in $K$. Let $\mathcal{U} := \{U_i : i \in I\}$ be an open cover of $K$. Define a function
$f : K \to \mathbb{R}$ by

\[ f(x) := \sup\{ r : \exists U \in \mathcal{U} : B(x, r) \subseteq U \} \]

Since $\mathcal{U}$ covers $K$, each $x \in K$ belongs to some $U \in \mathcal{U}$. Since this $U$ is open, $x$ is an interior
point of $U$. Hence $f(x) > 0$ for each $x \in K$. But we can say more:

\[ \inf_{x \in K} f(x) > 0 \qquad (*) \]

For suppose $(*)$ is false. Then there exists a sequence $(y_n)$ in $K$ such that $f(y_n) \to 0$; just
pick $y_n \in K$ so that $f(y_n) < \frac{1}{n}$. By assumption, $(y_n)$ has a convergent subsequence $(z_n)$ such
that $z := \lim_n z_n \in K$. Then $(f(z_n))$ is a subsequence of the convergent sequence $(f(y_n))$, and
hence $f(z_n) \to 0$ also. Since $z \in K$, there is $U \in \mathcal{U}$ such that $z \in U$. Let $r > 0$ be such
that $B(z, r) \subseteq U$. Then $z_n \in B(z, \frac{r}{2})$ eventually. Then $B(z_n, \frac{r}{2}) \subseteq B(z, r) \subseteq U$, and hence
$f(z_n) \ge \frac{r}{2}$ eventually. This contradicts $f(z_n) \to 0$. Hence $(*)$ holds.

With $(*)$ now proved, we can proceed: Choose $c$ so that $0 < c < \inf_{x \in K} f(x)$ and let
$x_1 \in K$ be arbitrary. If possible, choose inductively $x_{n+1} \in K$ so that $|x_{n+1} - x_j| \ge c$
for all $j = 1, \dots, n$. If this is possible for all $n$, one would obtain a sequence $(x_n)_n$ in $K$
with $|x_n - x_m| \ge c$ for all $n \ne m$, and such a sequence cannot have a convergent subsequence.
Hence there is $n$ for which it is impossible to choose $x_{n+1}$, and hence $K \subseteq \bigcup_{j=1}^{n} B(x_j, c)$.
By definition of $c$ there exists a $U_{i_j} \in \mathcal{U}$ so that $B(x_j, c) \subseteq U_{i_j}$ for $j = 1, \dots, n$. Thus
$K \subseteq \bigcup_{j=1}^{n} U_{i_j}$ yields a finite subcover.



Remarks 3.4.8 The fact that a compact set in R is the same as a closed and bounded set 
is called the Heine-Borel Theorem. 






□ 

Here comes the Completeness Axiom again: 



Theorem 3.4.9 Suppose that $K_1 \supseteq K_2 \supseteq K_3 \supseteq \cdots$ is a decreasing sequence of
non-empty compact subsets of $\mathbb{R}$. Then $\bigcap_{n} K_n \ne \emptyset$.



Exercise 3.4.10 We prove Theorem 3.4.9 using the fact that a set is compact if and only
if it is closed and bounded. Suppose that $K_1 \supseteq K_2 \supseteq K_3 \supseteq \cdots$ is a decreasing sequence of
non-empty compact subsets of $\mathbb{R}$.

(a) Let $u_n := \sup K_n$ and $l_n := \inf K_n$. Explain why $u_n, l_n \in K_n$.

(b) Show that $l_1 \le l_2 \le \cdots \le l_n \le \cdots \le u_n \le \cdots \le u_2 \le u_1$.

(c) Deduce that $l := \lim_n l_n$ exists.

(d) Show that the tail sequence $(l_n)_{n \ge m} = l_m, l_{m+1}, l_{m+2}, \dots$ is a convergent sequence in $K_m$,
and deduce that $l \in K_m$ for all $m$.

(e) Hence conclude that $\bigcap_m K_m \ne \emptyset$.

□ 
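To see the quantities in this exercise in a concrete case, consider the hypothetical family $K_n = [-\frac{1}{n}, \frac{1}{n}]$ (chosen only for illustration; it does not appear in the notes). The sketch below checks the chain of inequalities from part (b) for the first few sets, and checks that $l = \lim_n l_n = 0$ lies in each of them.

```python
from fractions import Fraction

def K(n):
    """Endpoints (l_n, u_n) of the illustrative compact set K_n = [-1/n, 1/n]."""
    return (Fraction(-1, n), Fraction(1, n))

N = 50
lowers = [K(n)[0] for n in range(1, N + 1)]   # l_1, ..., l_N
uppers = [K(n)[1] for n in range(1, N + 1)]   # u_1, ..., u_N

# Part (b): l_1 <= l_2 <= ... <= l_N <= u_N <= ... <= u_2 <= u_1
chain_ok = (
    all(a <= b for a, b in zip(lowers, lowers[1:]))
    and all(a >= b for a, b in zip(uppers, uppers[1:]))
    and lowers[-1] <= uppers[-1]
)

# For this family l = lim l_n = 0, and part (d) says l belongs to every K_m.
limit = Fraction(0)
in_all = all(K(m)[0] <= limit <= K(m)[1] for m in range(1, N + 1))
```

A finite check of this kind illustrates the statement for one family only; the exercise itself supplies the general argument.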

Exercise 3.4.11 (a) Give an example of a decreasing sequence of non-empty closed sets
$C_n \subseteq \mathbb{R}$ so that $\bigcap_n C_n = \emptyset$.

(b) Give an example of a decreasing sequence of non-empty bounded sets $A_n \subseteq \mathbb{R}$ so that
$\bigcap_n A_n = \emptyset$.

□ 



Exercise 3.4.12 We prove Theorem 3.4.9 again, this time using the definition of
compactness. Suppose that $K_1 \supseteq K_2 \supseteq K_3 \supseteq \cdots$ is a decreasing sequence of non-empty compact sets
in $\mathbb{R}$, but that $\bigcap_{n} K_n = \emptyset$. We seek to obtain a contradiction.

(a) Define $U_n := K_n^c$ (the complement of $K_n$) for $n \in \mathbb{N}$. Explain why $\{U_n : n \in \mathbb{N}\}$ is an open cover of $K_1$.

(b) Note that $U_1 \subseteq U_2 \subseteq U_3 \subseteq \cdots$. Conclude that there is $N \in \mathbb{N}$ such that $K_1 \subseteq U_N$.

(c) Hence deduce that $K_1 \cap K_N = \emptyset$, and explain why this is a contradiction.

□ 






Chapter 4 

Limits of Functions and Continuity 



4.1 Limits of Functions 

We have already defined the concept of limit for sequences, i.e. we know what is meant by a
statement of the form $\lim_{n \to \infty} x_n = l$. Our aim in this section is to give a similar definition for

\[ \lim_{x \to x_0} f(x) \]

where $f : X \to \mathbb{R}$ and $X \subseteq \mathbb{R}$. Later in this section, we shall also look at left limits and right
limits, concepts which employ the fact that $\mathbb{R}$ is also an ordered set.

Let $X \subseteq \mathbb{R}$, and let $f : X \to \mathbb{R}$ be a function. We want to investigate what we mean when
we say that

\[ \lim_{x \to x_0} f(x) = y_0 \]

where $x \in X$ and $x_0, y_0 \in \mathbb{R}$. Clearly the intention (i.e. the intuitive content) of the above
statement is that as $x$ "gets closer and closer" to $x_0$, $f(x)$ "gets closer and closer" to $y_0$. We
are not interested in what happens to the value of $f$ at the point $x_0$: If $x_0 \in X$, we already know
what it is; the value of the function there is simply $f(x_0)$. We are interested in what happens to
the values of $f$ as we get closer and closer to $x_0$, without actually allowing $x$ to be equal to
$x_0$. Indeed, it is quite possible that $x_0 \notin X$, so that $f(x_0)$ is not even defined.

As before, when we discussed limits of sequences, we run into the problem of not having
a definition of "close". But we already know how to circumvent this problem. For sequences,
the statement $\lim_n x_n = l$ meant that $x_n$ "gets closer and closer" to $l$ as $n$ gets bigger and
bigger. We conceptualized the notion of "closer and closer" as follows:

    $\lim_n x_n = l$ provided that, given any distance $\varepsilon > 0$, $x_n$ and
    $l$ lie within $\varepsilon$ of each other whenever $n$ is sufficiently large.

We employ the same trick here:

    $\lim_{x \to x_0} f(x) = y_0$ provided that, given any distance $\varepsilon > 0$, the
    points $f(x)$ and $y_0$ lie within $\varepsilon$ of each other whenever the
    points $x$ and $x_0$ are sufficiently close (but not equal) in $\mathbb{R}$.

We must therefore find, for any distance $\varepsilon > 0$, a distance $\delta > 0$ such that if $x, x_0$ lie within
$\delta$ (but are not equal to each other), then $f(x), y_0$ lie within $\varepsilon$.






Thus we define the meaning of $\lim_{x \to x_0} f(x) = y_0$ as follows:

    For any $\varepsilon > 0$ there is a $\delta > 0$ such that if $0 < |x - x_0| < \delta$, then $|f(x) - y_0| < \varepsilon$.

For this definition to make sense, we must be able to find $x \in X$ which are arbitrarily close,
but not equal, to $x_0$. This is possible precisely when $x_0$ is a cluster point of $X$. We recall here
the definition.

Definition 4.1.1 Let $X \subseteq \mathbb{R}$, and let $x_0 \in \mathbb{R}$. We say that $x_0$ is a cluster point of $X$ if
and only if for any $\delta > 0$ there is $x \in X$ such that $0 < |x - x_0| < \delta$.
If $x_0 \in X$ is not a cluster point of $X$, it is said to be an isolated point.

Note that a cluster point of a set $X$ need not be an element of $X$. Note also that imposing
the condition $0 < |x - x_0| < \delta$ is equivalent to saying that $x, x_0$ are "close" (within $\delta$ of each
other), but not equal. For revision, we recall some examples:

Examples 4.1.2 1. $0$ is the only cluster point of the set $\{\frac{1}{n} : n \in \mathbb{N}\}$. Each element of the
set is isolated.

2. $x$ is a cluster point of the interval $(a, b)$ if and only if $x \in [a, b]$. Thus $(a, b)$ has no isolated
points.

3. Each $x \in \mathbb{R}$ is a cluster point of the set $\mathbb{Q}$.

4. The set $\mathbb{Z}$ has no cluster points in $\mathbb{R}$.

5. A finite subset of $\mathbb{R}$ has no cluster points, i.e. each element is isolated.

6. If $x_n \to x$ and $x_n \ne x$ infinitely often, then $x$ is a cluster point of the set $\{x_n : n \in \mathbb{N}\}$.

□ 

Definition 4.1.3 Let $f : X \to \mathbb{R}$, let $x_0 \in \mathbb{R}$ be a cluster point of $X$, and let $y_0 \in \mathbb{R}$.

We say that

\[ \lim_{x \to x_0} f(x) = y_0 \quad \text{or} \quad f(x) \to y_0 \text{ as } x \to x_0 \]

if and only if: For every $\varepsilon > 0$, there is a $\delta > 0$ such that

\[ |f(x) - y_0| < \varepsilon \quad \text{whenever } x \in X \text{ and } 0 < |x - x_0| < \delta \]

In logical notation:

\[ \lim_{x \to x_0} f(x) = y_0 \iff \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x \in X \ [\, 0 < |x - x_0| < \delta \Rightarrow |f(x) - y_0| < \varepsilon \,] \]

Remarks 4.1.4 • For the symbol $\lim_{x \to x_0}$ to make sense, it is necessary that $x_0$ be a cluster
point of the set $X$. However, it need not belong to $X$.

• Note that $\delta$ plays the same role in the definition of $\lim_{x \to x_0} f(x)$ as does the $N$ in the
definition of $\lim_n x_n$. In particular, $\delta$ usually depends on $\varepsilon$: The smaller we choose $\varepsilon$,
the smaller we will have to take $\delta$.
Moreover, $\delta$ will usually also depend on $x_0$.






• We can define the notion of limits for functions between arbitrary metric spaces: If
$(X, d_X)$ and $(Y, d_Y)$ are metric spaces, and $f : X \to Y$, then we say that $\lim_{x \to x_0} f(x) = y_0$
if and only if

\[ \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x \ [\, 0 < d_X(x, x_0) < \delta \Rightarrow d_Y(f(x), y_0) < \varepsilon \,] \]

i.e. $f(x)$ can be made as close to $y_0$ as you like (in the space $(Y, d_Y)$) by taking $x$ to be
sufficiently close, but not equal, to $x_0$ (in the space $(X, d_X)$).

□ 

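Examples 4.1.5 1. Let $f : \mathbb{R} \to \mathbb{R} : x \mapsto x^2$. Then

\[ \lim_{x \to 2} f(x) = 4 \]

For suppose $\varepsilon > 0$. We must find a $\delta > 0$ such that

\[ |x^2 - 4| < \varepsilon \quad \text{whenever } 0 < |x - 2| < \delta \]

Choose $\delta = \min\{\frac{\varepsilon}{5}, 1\}$. Then if $|x - 2| < \delta$, then also $|x - 2| < 1$, which implies that
$1 < x < 3$, and thus $|x + 2| < 5$. Now if $0 < |x - 2| < \delta$, we have

\[ |x^2 - 4| = |x + 2| \cdot |x - 2| < 5|x - 2| < 5\delta \le \varepsilon \]

as required.

This example makes it clear that $\delta$ (usually) depends on $\varepsilon$.

2. Let $f : \mathbb{R} \to \mathbb{R} : x \mapsto x^2$, and let $x_0 \in \mathbb{R}$. Then

\[ \lim_{x \to x_0} f(x) = x_0^2 \]

For suppose $\varepsilon > 0$. We must find a $\delta > 0$ such that

\[ |x^2 - x_0^2| < \varepsilon \quad \text{whenever } 0 < |x - x_0| < \delta \]

Choose $\delta = \min\{\frac{\varepsilon}{1 + 2|x_0|}, 1\}$. Then if $|x - x_0| < \delta$, then also $|x - x_0| < 1$, which implies
that $|x + x_0| = |x - x_0 + 2x_0| \le |x - x_0| + 2|x_0| < 1 + 2|x_0|$. Now if $0 < |x - x_0| < \delta$, we
have

\[ |x^2 - x_0^2| = |x + x_0| \cdot |x - x_0| < (1 + 2|x_0|)|x - x_0| < (1 + 2|x_0|)\delta \le \varepsilon \]

as required.

This example makes it clear that $\delta$ (usually) depends on both $\varepsilon$ and $x_0$.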

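3. Let $f : \mathbb{R} \to \mathbb{R}$ be defined by

\[ f(x) = \begin{cases} 0 & \text{if } x \le 0 \\ 1 & \text{if } x > 0 \end{cases} \]

Let us consider $\lim_{x \to 0} f(x)$. From the graph of $f$, it is clear that if $x$ is close to $0$ and $x > 0$,
then $f(x) = 1$. On the other hand, if $x$ is close to $0$, but $x < 0$, then $f(x) = 0$. This
suggests that $\lim_{x \to 0} f(x)$ does not exist.

We now prove that, indeed, $\lim_{x \to 0} f(x)$ does not exist. For suppose that $y_0 \in \mathbb{R}$, and that
$\lim_{x \to 0} f(x) = y_0$. Let $\varepsilon = \frac{1}{2}$. Then we should be able to find a $\delta > 0$ such that $|f(x) - y_0| < \frac{1}{2}$
whenever $0 < |x| < \delta$. In particular, $|f(\frac{\delta}{2}) - y_0| < \frac{1}{2}$ and $|f(-\frac{\delta}{2}) - y_0| < \frac{1}{2}$, as $|\pm\frac{\delta}{2}| < \delta$.
Thus

\[ 1 = |f(\tfrac{\delta}{2}) - f(-\tfrac{\delta}{2})| \le |f(\tfrac{\delta}{2}) - y_0| + |y_0 - f(-\tfrac{\delta}{2})| < \tfrac{1}{2} + \tfrac{1}{2} = 1 \]

i.e. $1 < 1$, a contradiction.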
4. Define

\[ f(x) = \begin{cases} 0 & \text{if } x \ne 0 \\ 1 & \text{if } x = 0 \end{cases} \]

Then it is easy to see that $\lim_{x \to 0} f(x) = 0$. Indeed, given $\varepsilon > 0$, let $\delta$ be any positive number
whatsoever. If $0 < |x| < \delta$, then $x \ne 0$, so $f(x) = 0$, and so $|f(x) - 0| < \varepsilon$, as required.

Note that $\lim_{x \to 0} f(x) = 0$, whereas $f(0) = 1$.

The point is that the value of a function at a particular $x_0$ may be totally unrelated to the
behaviour of the function as $x \to x_0$.

□ 
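The choice of $\delta$ in Examples 1 and 2 can be spot-checked numerically. The sketch below samples points with $0 < |x - x_0| < \delta$ and tests whether $|x^2 - x_0^2| < \varepsilon$; the function names are illustrative, and a finite sample can only support the argument above, not replace it.

```python
def delta_for_square(eps, x0):
    """The delta of Example 2: min{eps / (1 + 2|x0|), 1}.

    For x0 = 2 this is min{eps / 5, 1}, the choice made in Example 1.
    """
    return min(eps / (1 + 2 * abs(x0)), 1.0)

def check(eps, x0, samples=1000):
    """Test |x^2 - x0^2| < eps at sample points with 0 < |x - x0| < delta."""
    delta = delta_for_square(eps, x0)
    for i in range(1, samples):
        t = i / samples                      # 0 < t < 1
        for x in (x0 - t * delta, x0 + t * delta):
            if not abs(x * x - x0 * x0) < eps:
                return False
    return True

ok_at_2 = check(0.1, 2.0)        # Example 1 with eps = 0.1
ok_general = check(1e-3, -7.5)   # Example 2 with a different x0
```

Changing `x0` while keeping `eps` fixed changes the value returned by `delta_for_square`, which is the dependence of $\delta$ on $x_0$ noted in Example 2.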

The following important proposition shows that the definition of the limit of a function
at a point $x_0$ can be rephrased in terms of limits of sequences.



Proposition 4.1.6 Suppose that $X \subseteq \mathbb{R}$, that $f : X \to \mathbb{R}$ and that $y_0 \in \mathbb{R}$. Then the
following are equivalent:

(a) $\lim_{x \to x_0} f(x) = y_0$

(b) Whenever $(x_n)_n$ is a sequence in $X$ such that

(i) $\lim_n x_n = x_0$ in $\mathbb{R}$;

(ii) $x_n \ne x_0$ for all $n \in \mathbb{N}$;

then we have $\lim_n f(x_n) = y_0$.









Proof: (a) $\Rightarrow$ (b): Suppose that $\lim_{x \to x_0} f(x) = y_0$, and that $(x_n)_n$ is a sequence in $X$ satisfying
(i) and (ii) above. We must show that $f(x_n) \to y_0$, i.e. we must show that

    For all $\varepsilon > 0$ there is $N$ such that $|f(x_n) - y_0| < \varepsilon$ whenever $n \ge N$

So let $\varepsilon > 0$. Because $\lim_{x \to x_0} f(x) = y_0$, there is $\delta > 0$ such that

    $0 < |x - x_0| < \delta$ implies $|f(x) - y_0| < \varepsilon$

Now because $\lim_{n \to \infty} x_n = x_0$, there is $N \in \mathbb{N}$ such that

    $n \ge N$ implies $|x_n - x_0| < \delta$

Then if $n \ge N$, we see that $0 < |x_n - x_0| < \delta$, so that $|f(x_n) - y_0| < \varepsilon$, as required. Thus (a)
implies (b).

(b) $\Rightarrow$ (a): To prove the converse, we argue by contradiction: Suppose that $\lim_{x \to x_0} f(x) \ne y_0$.
(In fact, the limit need not even exist.) I.e. suppose that

\[ \neg \big( \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x \in X \ [\, 0 < |x - x_0| < \delta \Rightarrow |f(x) - y_0| < \varepsilon \,] \big) \]

which is to say, that

\[ \exists \varepsilon > 0 \ \forall \delta > 0 \ \exists x \in X \ [\, 0 < |x - x_0| < \delta \ \wedge \ |f(x) - y_0| \ge \varepsilon \,] \]

i.e. that there is an $\varepsilon > 0$ such that for every $\delta > 0$, we are able to find an $x \in X$ such that

    $0 < |x - x_0| < \delta$ but $|f(x) - y_0| \ge \varepsilon$

In particular, if $\delta = \frac{1}{n}$, we are able to find an $x_n \in X$ such that

    $0 < |x_n - x_0| < \frac{1}{n}$ yet $|f(x_n) - y_0| \ge \varepsilon$

We have thus found a sequence $(x_n)_n$ in $X$ satisfying (i), (ii) such that $f(x_n) \not\to y_0$. Hence
not-(a) implies not-(b), i.e. (b) implies (a).

∎
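The sequence criterion is often the quickest way to see that a limit fails to exist. As an illustration, the sketch below takes a step function that is 0 to the left of the origin and 1 to the right of it, as in the third of Examples 4.1.5 (its value at the origin plays no role, and the choice made in the code is an assumption), and evaluates it along $x_n = (-1)^n / n$, a sequence with $x_n \to 0$ and $x_n \ne 0$.

```python
def step(x):
    """A step function: 1 for x > 0 and 0 for x < 0 (value at 0 is irrelevant here)."""
    return 1.0 if x > 0 else 0.0

# x_n = (-1)^n / n converges to 0 and is never equal to 0.
xs = [(-1) ** n / n for n in range(1, 2001)]
values = [step(x) for x in xs]

# Far out in the sequence, f(x_n) still takes both values 0 and 1,
# so (f(x_n)) cannot converge.
tail = values[1000:]
```

Since $(f(x_n))$ oscillates between 0 and 1, condition (b) of Proposition 4.1.6 fails for every candidate $y_0$, and so $\lim_{x \to 0} f(x)$ does not exist.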



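Corollary 4.1.7 (a) If $\lim_{x \to x_0} f(x)$ exists, then the limit is unique;

(b) $\lim_{x \to x_0} f(x) + \lim_{x \to x_0} g(x) = \lim_{x \to x_0} \big( f(x) + g(x) \big)$;

(c) $\lim_{x \to x_0} \alpha f(x) = \alpha \lim_{x \to x_0} f(x)$;

(d) $\big( \lim_{x \to x_0} f(x) \big) \big( \lim_{x \to x_0} g(x) \big) = \lim_{x \to x_0} f(x) g(x)$ when $f, g$ are between $\mathbb{R}$ and $\mathbb{R}$;

(e) $\dfrac{\lim_{x \to x_0} f(x)}{\lim_{x \to x_0} g(x)} = \lim_{x \to x_0} \dfrac{f(x)}{g(x)}$ when $f, g$ are between $\mathbb{R}$ and $\mathbb{R}$, provided that $\lim_{x \to x_0} g(x) \ne 0$.

Proof: These properties hold for limits of sequences, and therefore carry over to limits of
functions by Proposition 4.1.6.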



We end this section with a brief look at left- and right limits of functions: 






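Definition 4.1.8 Suppose that $X \subseteq \mathbb{R}$, and that $f : X \to \mathbb{R}$. We say that $f(x)$ tends to
$y_0$ as $x$ tends to $x_0$ from the right, and write

\[ \lim_{x \downarrow x_0} f(x) = y_0 \quad \text{or} \quad f(x) \to y_0 \text{ as } x \downarrow x_0 \]

if and only if: For every $\varepsilon > 0$, there is a $\delta > 0$ such that

\[ |f(x) - y_0| < \varepsilon \quad \text{whenever } 0 < x - x_0 < \delta \]

Similarly, we say that $f(x)$ tends to $y_0$ as $x$ tends to $x_0$ from the left, and write

\[ \lim_{x \uparrow x_0} f(x) = y_0 \quad \text{or} \quad f(x) \to y_0 \text{ as } x \uparrow x_0 \]

if and only if: For every $\varepsilon > 0$, there is a $\delta > 0$ such that

\[ |f(x) - y_0| < \varepsilon \quad \text{whenever } 0 < x_0 - x < \delta \]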



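An example will make these ideas clear:

Example 4.1.9 Define $f : \mathbb{R} \to \mathbb{R}$ by

\[ f(x) = \begin{cases} x - 1 & \text{if } x < 1 \\ 1 & \text{if } x = 1 \\ 1 + x^2 & \text{if } x > 1 \end{cases} \]

Then $\lim_{x \uparrow 1} f(x) = 0$, and $\lim_{x \downarrow 1} f(x) = 2$. Moreover, $\lim_{x \to 1} f(x)$ does not exist.

To see this, let $\varepsilon > 0$, and choose $\delta = \varepsilon$. Then if $1 - \delta < x < 1$, we also have $1 - \varepsilon < x < 1$.
Now since $f(x) = x - 1$ when $x < 1$, we see that

\[ 0 < 1 - x < \delta \quad \text{implies} \quad |f(x) - 0| < \varepsilon \]

Thus $f(x) \to 0$ as $x \uparrow 1$.

Similarly, given $\varepsilon > 0$, choose $\delta > 0$ sufficiently small that $\delta(\delta + 2) < \varepsilon$. Note that if
$0 < x - 1 < \delta$, then $|f(x) - 2| = |x^2 - 1| = (x - 1)(x + 1) < \delta(\delta + 2)$, and thus

\[ 0 < x - 1 < \delta \ \Rightarrow \ |f(x) - 2| < \varepsilon \]

This proves that $f(x) \to 2$ as $x \downarrow 1$.

Finally, the fact that $\lim_{x \to 1} f(x)$ does not exist follows from the next proposition, Proposition 4.1.11.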

□ 
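The one-sided estimates in Example 4.1.9 can be spot-checked in the same spirit. In the sketch below, `delta_right` is one admissible choice (an assumption made for the illustration; the example only asks that $\delta(\delta + 2) < \varepsilon$), and the check samples finitely many points on each side of 1.

```python
def f(x):
    """The piecewise function of Example 4.1.9."""
    if x < 1:
        return x - 1
    if x == 1:
        return 1.0
    return 1 + x * x

def delta_left(eps):
    # The example takes delta = eps for the left limit.
    return eps

def delta_right(eps):
    # With delta <= 1 we get delta * (delta + 2) <= 3 * delta <= eps.
    return min(1.0, eps / 3)

def check_one_sided(eps, samples=1000):
    """Sample points on each side of 1 and test the two epsilon estimates."""
    dl, dr = delta_left(eps), delta_right(eps)
    for i in range(1, samples):
        t = i / samples                      # 0 < t < 1
        if not abs(f(1 - t * dl) - 0) < eps:   # left limit is 0
            return False
        if not abs(f(1 + t * dr) - 2) < eps:   # right limit is 2
            return False
    return True

one_sided_ok = all(check_one_sided(eps) for eps in (2.0, 0.5, 1e-3))
```

The values of $f$ just to the left of 1 stay near 0 while those just to the right stay near 2, which is the numerical face of the statement that the two one-sided limits differ.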

Exercise 4.1.10 Show that if $f : \mathbb{R} \to \mathbb{R}$, then the following are equivalent:

(a) $\lim_{x \downarrow x_0} f(x) = y_0$

(b) For every strictly decreasing sequence $x_n \to x_0$ it is the case that $f(x_n) \to y_0$.






□ 

The limit of a function exists if and only if both the left limit and the right limit exist, 
and are equal: 

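Proposition 4.1.11 Suppose that $X \subseteq \mathbb{R}$, and that $f : X \to \mathbb{R}$. Then $\lim_{x \to x_0} f(x)$ exists if
and only if

(i) Both $\lim_{x \uparrow x_0} f(x)$ and $\lim_{x \downarrow x_0} f(x)$ exist, and
(ii) $\lim_{x \uparrow x_0} f(x) = \lim_{x \downarrow x_0} f(x)$.

In that case,

\[ \lim_{x \to x_0} f(x) = \lim_{x \uparrow x_0} f(x) = \lim_{x \downarrow x_0} f(x) \]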



Exercise 4.1.12 Prove Proposition 4.1.11 



□ 

4.2 Continuity 

In this section, we define the notion of a continuous function, and provide several useful 
characterizations of this notion. We shall get the theoretical part over with as soon as possible, 
and then — in the next section — we will analyze several examples, with the aim of making 
these abstract ideas concrete. 

Intuitively, to say that $f$ is continuous at a point $x_0 \in X$ means that as $x$ "gets closer and
closer" to $x_0$ in $X$, the function values $f(x)$ "get closer and closer" to $f(x_0)$. Thus the
following definition should be devoid of mystery:

Definition 4.2.1 (a) Let $f : X \to \mathbb{R}$ be a function, where $X \subseteq \mathbb{R}$, and let $x_0 \in X$. We
say that $f$ is continuous at $x_0$ if and only if for all $\varepsilon > 0$ there is a $\delta > 0$ such that

\[ |f(x) - f(x_0)| < \varepsilon \quad \text{whenever } x \in X \text{ and } |x - x_0| < \delta \]

I.e.

\[ \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x \in X \ [\, |x - x_0| < \delta \Rightarrow |f(x) - f(x_0)| < \varepsilon \,] \]

(b) A function $f : X \to \mathbb{R}$ is said to be continuous if and only if it is continuous at every
point in its domain.

(c) A function is said to be discontinuous at $x_0$ if and only if it is not continuous at $x_0$.



Remarks 4.2.2 • It ought to be clear that continuity is a concept that applies to functions
between metric spaces: If $f : (X, d_X) \to (Y, d_Y)$ is a function between two metric
spaces, then $f$ is said to be continuous at $x_0$ if and only if

\[ \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x \in X \ [\, d_X(x, x_0) < \delta \Rightarrow d_Y(f(x), f(x_0)) < \varepsilon \,] \]

i.e. $f(x)$ can be made as close to $f(x_0)$ as you like (in the space $(Y, d_Y)$) by taking $x$ to
be sufficiently close to $x_0$ (in the space $(X, d_X)$).

• Note that $\lim_{x \to x_0} f(x) = f(x_0)$ if and only if for all $\varepsilon > 0$ there is a $\delta > 0$ such that

\[ |f(x) - f(x_0)| < \varepsilon \quad \text{whenever } 0 < |x - x_0| < \delta \]

The difference between saying that $\lim_{x \to x_0} f(x) = f(x_0)$ and saying that $f$ is continuous at
$x_0$ is therefore very slight: The former requires that $x_0$ is a cluster point of $X$, whereas
the latter does not.

• Note that if $x_0$ belongs to $X$, but is not a cluster point of $X$ (i.e. if $x_0$ is an isolated
point of $X$), then the statement "$\lim_{x \to x_0} f(x) = f(x_0)$" is meaningless, but the statement
"$f$ is continuous at $x_0$" is well-defined, and moreover, always true.

Fact: If $x_0$ is an isolated point of $X$, then any function $X \to \mathbb{R}$ is continuous at $x_0$.

To see this, note that if $x_0 \in X$ is not a cluster point of $X$, then there is a $\delta > 0$ such
that if $0 < |x - x_0| < \delta$ then $x \notin X$. It follows that if $x \in X$ and $|x - x_0| < \delta$, then
$x = x_0$. But then, given any $\varepsilon > 0$, we see that $|x - x_0| < \delta$ implies $x = x_0$ (because
$x \in X$), so that $|f(x) - f(x_0)| = 0 < \varepsilon$.

□ 

To summarize these last remarks: 
Proposition 4.2.3 Let $X \subseteq \mathbb{R}$, let $x_0 \in X$, and let $f : X \to \mathbb{R}$.

• If $x_0$ is a cluster point of $X$, then $f$ is continuous at $x_0$ if and only if
$\lim_{x \to x_0} f(x) = f(x_0)$.

• If $x_0$ is not a cluster point of $X$, then $f$ is continuous at $x_0$ (no matter what $f$ is).

Next, we discuss two characterizations of continuity: The first is in terms of sequences,
and the second invokes topological notions. For this purpose, recall that

Proposition 4.2.4 The following are equivalent:

• $x_n \to x_0$ in $\mathbb{R}$;

• Whenever a set $U \subseteq \mathbb{R}$ is a neighbourhood of $x_0$ (i.e. whenever $x_0$ is an interior
point of $U$), then $x_n \in U$ eventually (i.e. there is $N$ such that $x_n \in U$ for all $n \ge N$).

Theorem 4.2.5 Let $f : X \to \mathbb{R}$ be a function, where $X \subseteq \mathbb{R}$, and let $x_0 \in X$. Then the
following are equivalent.

(a) $f$ is continuous at $x_0$.

(b) If $(x_n)_n$ is a sequence in $X$ such that $x_n \to x_0$, then $f(x_n) \to f(x_0)$.

(c) For every neighbourhood $V$ of $f(x_0)$ there is a neighbourhood $U$ of $x_0$ such that:

\[ f[X \cap U] \subseteq V \]






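Proof: (a) $\Rightarrow$ (c): Suppose that $f : X \to \mathbb{R}$ is continuous at $x_0$, and that $V$ is a neighbourhood
of $f(x_0)$. Then $f(x_0)$ is an interior point of $V$, i.e. there is $\varepsilon > 0$ such that $B(f(x_0), \varepsilon) :=
(f(x_0) - \varepsilon, f(x_0) + \varepsilon)$ is a subset of $V$. Choose now a $\delta > 0$ such that $x \in X$ and $|x - x_0| < \delta$ imply
$|f(x_0) - f(x)| < \varepsilon$, and let $U := B(x_0, \delta) := (x_0 - \delta, x_0 + \delta)$. Then

\[ x \in U \cap X \ \Rightarrow \ x \in X \wedge |x - x_0| < \delta \ \Rightarrow \ |f(x) - f(x_0)| < \varepsilon \ \Rightarrow \ f(x) \in B(f(x_0), \varepsilon) \subseteq V \]

Thus $U$ is a neighbourhood of $x_0$, and $f[U \cap X] \subseteq V$.

(c) $\Rightarrow$ (b): Suppose that $x_n \to x_0$ in $X$. We must show that $f(x_n) \to f(x_0)$, i.e. that if
$V$ is any neighbourhood of $f(x_0)$, then $f(x_n) \in V$ eventually. So let $V$ be a neighbourhood
of $f(x_0)$. By assumption, there is a neighbourhood $U$ of $x_0$ such that $f(x) \in V$ whenever
$x \in U \cap X$. Since $x_n \to x_0$, we must have $x_n \in U$ eventually. Hence $f(x_n) \in V$ eventually, as
well.

(b) $\Rightarrow$ (a): We prove this by contradiction. Suppose, therefore, that $f$ is not continuous at
$x_0$, i.e. that

\[ \neg \big( \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x \in X \ [\, |x - x_0| < \delta \Rightarrow |f(x) - f(x_0)| < \varepsilon \,] \big) \]

i.e. that

\[ \exists \varepsilon > 0 \ \forall \delta > 0 \ \exists x \in X \ [\, |x - x_0| < \delta \ \wedge \ |f(x) - f(x_0)| \ge \varepsilon \,] \]

Thus there is an $\varepsilon > 0$ such that for every $\delta > 0$ we can find an $x \in X$ with the properties
that $|x - x_0| < \delta$, yet $|f(x) - f(x_0)| \ge \varepsilon$. Fix such an $\varepsilon > 0$, and successively take $\delta = \frac{1}{n}$ for
$n \in \mathbb{N}$. This yields, for each $n$, an $x_n \in X$ such that $|x_n - x_0| < \frac{1}{n}$, yet $|f(x_n) - f(x_0)| \ge \varepsilon$.
Then $x_n \to x_0$ in $X$, yet $f(x_n) \not\to f(x_0)$ in $\mathbb{R}$. Thus not-(a) implies not-(b).

∎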

Up to now we have discussed continuity at a point. Here is a beautiful characterization of 
global continuity: 

Theorem 4.2.6 Let $f : \mathbb{R} \to \mathbb{R}$. Then the following are equivalent.
(1.) $f$ is (everywhere) continuous.

(2.) Pullbacks of open sets are open: Whenever $U \subseteq \mathbb{R}$ is an open set, then its inverse
image $f^{-1}[U]$ is open also.

(3.) Pullbacks of closed sets are closed: Whenever $C \subseteq \mathbb{R}$ is a closed set, then its inverse
image $f^{-1}[C]$ is closed also.

In abstract topology, (2.) is used as a definition of continuous function! It uses only the notion
of open sets: No $\varepsilon$, no metric.

Exercise 4.2.7 We prove Theorem 4.2.6.

(a) We first prove that (1.) $\Rightarrow$ (2.): Suppose that $f : \mathbb{R} \to \mathbb{R}$ is continuous, and that $U \subseteq \mathbb{R}$ is
open. We want to prove that $f^{-1}[U]$ is open, i.e. that every point of $f^{-1}[U]$ is an interior
point.

(i) So let $x_0 \in f^{-1}[U]$. Explain why there is $\varepsilon > 0$ such that $B(f(x_0), \varepsilon) \subseteq U$.

(ii) Next, explain why there is $\delta > 0$ such that $x \in B(x_0, \delta)$ implies $f(x) \in B(f(x_0), \varepsilon)$.

(iii) Conclude that $B(x_0, \delta) \subseteq f^{-1}[U]$, and thus that $x_0$ is an interior point of $f^{-1}[U]$.

(b) Next, we show (2.) $\Rightarrow$ (1.): Suppose that $f : \mathbb{R} \to \mathbb{R}$ is such that pullbacks of open sets
are open. We want to prove that $f$ is continuous at every $x_0 \in \mathbb{R}$. So let $\varepsilon > 0$.

(i) Explain why $x_0$ is an interior point of $f^{-1}[B(f(x_0), \varepsilon)]$.

(ii) Conclude that there is $\delta > 0$ such that $B(x_0, \delta) \subseteq f^{-1}[B(f(x_0), \varepsilon)]$.

(iii) Hence show that if $|x - x_0| < \delta$, then $|f(x_0) - f(x)| < \varepsilon$.

(iv) Conclude that $f$ is continuous at every $x_0$.

(c) Use the fact that pullbacks commute with set operations to prove that (2.) $\Leftrightarrow$ (3.).

□ 

Finally, some definitions of one-sided continuity: 



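Definition 4.2.8 Let $f : X \to \mathbb{R}$, where $X \subseteq \mathbb{R}$, and $x_0 \in X$. We say that $f$ is
right-continuous at $x_0$ if and only if

\[ \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x \in X \ [\, 0 < x - x_0 < \delta \Rightarrow |f(x) - f(x_0)| < \varepsilon \,] \]

Similarly, $f$ is said to be left-continuous at $x_0$ if and only if

\[ \forall \varepsilon > 0 \ \exists \delta > 0 \ \forall x \in X \ [\, 0 < x_0 - x < \delta \Rightarrow |f(x) - f(x_0)| < \varepsilon \,] \]

The following assertions are easy to see:

Remarks 4.2.9 Let $f : X \to \mathbb{R}$, where $X \subseteq \mathbb{R}$, and $x_0 \in X$.

• $f$ is continuous at $x_0$ if and only if it is both left- and right-continuous at $x_0$.

• If $x_0$ is a cluster point of $X$, then $f$ is right-continuous at $x_0$ if and only if $\lim_{x \downarrow x_0} f(x) =
f(x_0)$.

• Similarly, $f$ is left-continuous at the cluster point $x_0$ if and only if $\lim_{x \uparrow x_0} f(x) = f(x_0)$.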

□ 

4.3 Operations on Continuous Functions; Examples 

This short section lists some simple results about the preservation of continuity. The first 
result shows that compositions of continuous functions are continuous. 



Theorem 4.3.1 Suppose that $f : X \to Y$ and that $g : Y \to \mathbb{R}$, where $X, Y \subseteq \mathbb{R}$. Suppose
further that $f$ is continuous at $x_0$, and that $g$ is continuous at $f(x_0)$. Then $g \circ f$ is
continuous at $x_0$.






Proof: We give three proofs, each involving a different characterization of continuity:

First Proof: Let $y_0 = f(x_0)$, and suppose that $\varepsilon > 0$. Choose $\delta_1 > 0$ such that

\[ |y - y_0| < \delta_1 \ \text{implies} \ |g(y) - g(y_0)| < \varepsilon \]

Then (using $\delta_1$ as your $\varepsilon$), choose a $\delta > 0$ such that

\[ |x - x_0| < \delta \ \text{implies} \ |f(x) - f(x_0)| < \delta_1 \]

We then note that

\[ |x - x_0| < \delta \ \Rightarrow \ |f(x) - f(x_0)| < \delta_1 \ \Rightarrow \ |g(f(x)) - g(f(x_0))| < \varepsilon \]

Second Proof: It suffices to show that if $x_n \to x_0$, then $(g \circ f)(x_n) \to (g \circ f)(x_0)$. Now if
$x_n \to x_0$, then $f(x_n) \to f(x_0)$ because $f$ is continuous at $x_0$. It follows immediately that
$g(f(x_n)) \to g(f(x_0))$, because $g$ is continuous at $f(x_0)$.

Third Proof: We show that if $V$ is a neighbourhood of $g(f(x_0))$, then there is a neighbourhood
$U$ of $x_0$ such that

\[ x \in X \cap U \ \Rightarrow \ (g \circ f)(x) \in V \]

So let $V$ be a neighbourhood of $g(f(x_0))$. Since $g$ is continuous at $f(x_0)$, there is a
neighbourhood $W$ of $f(x_0)$ such that

\[ y \in Y \cap W \ \Rightarrow \ g(y) \in V \]

Then, because $f$ is continuous at $x_0$, there is a neighbourhood $U$ of $x_0$ such that

\[ x \in X \cap U \ \Rightarrow \ f(x) \in W \]

Of course, since $f : X \to Y$, each $f(x) \in Y$. Hence

\[ x \in X \cap U \ \Rightarrow \ f(x) \in Y \cap W \ \Rightarrow \ g(f(x)) \in V \]

as required.

∎
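The first proof is effectively a recipe: feed the $\delta_1$ obtained for $g$ into $f$ as its $\varepsilon$. The sketch below carries this out for the illustrative pair $f(x) = x^2$ and $g(y) = 3y + 1$; these two functions, and the helper names, are chosen here for the example and are not taken from the notes.

```python
def f(x):
    return x * x

def g(y):
    return 3 * y + 1

def delta_f(eps, x0):
    """A delta that works for f(x) = x^2 at x0."""
    return min(eps / (1 + 2 * abs(x0)), 1.0)

def delta_g(eps):
    """A delta that works for g(y) = 3y + 1 at any point: |g(y) - g(y0)| = 3|y - y0|."""
    return eps / 3

def delta_composite(eps, x0):
    """Compose the two choices exactly as in the first proof."""
    delta_1 = delta_g(eps)          # |y - y0| < delta_1  =>  |g(y) - g(y0)| < eps
    return delta_f(delta_1, x0)     # |x - x0| < delta    =>  |f(x) - f(x0)| < delta_1

def check_composite(eps, x0, samples=1000):
    """Test |g(f(x)) - g(f(x0))| < eps at sample points with |x - x0| < delta."""
    delta = delta_composite(eps, x0)
    target = g(f(x0))
    for i in range(-samples + 1, samples):
        x = x0 + (i / samples) * delta
        if not abs(g(f(x)) - target) < eps:
            return False
    return True

composite_ok = check_composite(0.1, 2.0)
```

Note that the order of the two choices matters: the tolerance for $g$ must be fixed first, because it determines the target accuracy that $f$ has to meet.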

Suppose that $f : X \to \mathbb{R}$, where $X \subseteq \mathbb{R}$. To begin with, note that to say

\[ \lim_{x \to x_0} f(x) = y_0 \]

is exactly the same as to say

\[ \lim_{h \to 0} f(x_0 + h) = y_0 \qquad (x_0 + h \in X) \]

To see this equivalence, just define $h = x - x_0$. Then $h \to 0$ if and only if $x \to x_0$, and
$f(x_0 + h) = f(x)$.



Proposition 4.3.2 The absolute value function $|\cdot| : \mathbb{R} \to \mathbb{R}$ is continuous.

Proof: We use the sequence characterization of continuity. It suffices to show that if $x_n \to x_0$
in $\mathbb{R}$, then $|x_n| \to |x_0|$ in $\mathbb{R}$. But this we already know. (But in case you have forgotten, just
note that $\big| |x_n| - |x_0| \big| \le |x_n - x_0|$.)



Note that if $f : X \to \mathbb{R}$, then the function $|f| : X \to \mathbb{R} : x \mapsto |f(x)|$ is simply the
composition of $f$ with $|\cdot|$, i.e. $|f| = |\cdot| \circ f$. Thus:

Corollary 4.3.3 If $f : X \to \mathbb{R}$ is continuous, then so is $|f| : X \to \mathbb{R}$.

The following result is a trivial consequence of the properties of limits:

Theorem 4.3.4 If $f, g$ are continuous at $x_0$, and if $\alpha \in \mathbb{R}$, then

\[ f + g, \quad fg, \quad \alpha f, \quad \frac{f}{g} \]

are also continuous at $x_0$ (when they are defined, and provided no division by zero occurs).

Of course, for $\frac{f}{g}$ to be well-defined, $g$ must be a non-zero real-valued function.



Proposition 4.3.5 If p{x) is a polynomial with real coefficients, then p : R ^ R is 
continuous. 



Exercise 4.3.6 Prove Propn. 4.3.5 



□ 



Exercise 4.3.7 We prove that the function $\sqrt{\cdot} : \mathbb{R}^+ \to \mathbb{R}$ is continuous.

(a) Suppose that $(y_n)_n$ is a non-negative sequence in $\mathbb{R}$ which converges to $y$. Explain why
it suffices to show that then also $\sqrt{y_n} \to \sqrt{y}$.

(b) We consider, separately, two cases: $y > 0$ and $y = 0$. First assume that $y = 0$, i.e. that
$y_n \to 0$. Prove that in that case also $\sqrt{y_n} \to 0$.

(c) Next assume that $y > 0$. Explain why there is a $K > 0$ such that $y > K$ and such that
$y_n \ge K$ eventually.

(d) Now notice that $(\sqrt{y_n} - \sqrt{y})(\sqrt{y_n} + \sqrt{y}) = (y_n - y)$, and that $(\sqrt{y_n} + \sqrt{y}) \ge 2\sqrt{K}$ eventually. Use
these facts to show that $\sqrt{y_n} \to \sqrt{y}$ in this case also.

[Hint: Choose $N$ such that $|y_n - y| < 2\sqrt{K}\,\varepsilon$ whenever $n \ge N$.]



□ 



Example 4.3.8 Trigonometric functions are continuous on their domain.

• To begin with, let us prove that the function $\sin(x)$ is continuous at 0. Recall the
following inequality:

\[ |\sin(x)| \le |x| \]

This is easy to see geometrically: Consider a unit circle centered at the origin, and draw
a ray at an angle of $x$ radians with the positive $x$-axis. The ray intersects the circle at
the point $(\cos(x), \sin(x))$. The length of the arc from the point $(1, 0)$ on the $x$-axis to
the point of intersection is $|x|$, and the perpendicular height of this point is $|\sin(x)|$.

• Thus a sandwich argument shows that $\lim_{x \to 0} \sin(x) = 0$, i.e. that $\sin(x)$ is continuous at
$x = 0$.

• Next, note that

\[ \sin(x) - \sin(x_0) = 2 \sin\left( \frac{x - x_0}{2} \right) \cos\left( \frac{x + x_0}{2} \right) \]

and thus that

\[ |\sin(x) - \sin(x_0)| \le 2 \left| \sin\left( \frac{x - x_0}{2} \right) \right| \]

Let $\varepsilon > 0$, and choose $\delta_1 > 0$ such that $|h| < \delta_1$ implies $|\sin(h)| < \frac{\varepsilon}{2}$. Then choose
$\delta = 2\delta_1$. It is now easy to see that

\[ |x - x_0| < \delta \ \Rightarrow \ \left| \frac{x - x_0}{2} \right| < \delta_1 \ \Rightarrow \ \left| \sin\left( \frac{x - x_0}{2} \right) \right| < \frac{\varepsilon}{2} \ \Rightarrow \ |\sin(x) - \sin(x_0)| < \varepsilon \]

Now the continuity of $\cos(x)$ follows from the relationship $\sin^2(x) + \cos^2(x) = 1$ and the
continuity of $\sqrt{x}$.

All other trigonometric functions are ratios involving $\sin(x)$ and $\cos(x)$, and are thus
continuous, by Theorem 4.3.4.



□ 
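The argument for $\sin$ can also be checked numerically. Because $|\sin(h)| \le |h|$, one admissible choice is $\delta_1 = \frac{\varepsilon}{2}$, which gives $\delta = 2\delta_1 = \varepsilon$; this particular choice is an assumption made for the sketch below, and any smaller $\delta_1$ would do as well.

```python
import math

def delta_for_sin(eps):
    # |sin(h)| <= |h| lets us take delta_1 = eps / 2, hence delta = 2 * delta_1 = eps.
    delta_1 = eps / 2
    return 2 * delta_1

def check_sin(eps, x0, samples=1000):
    """Test |sin(x) - sin(x0)| < eps at sample points with |x - x0| < delta."""
    delta = delta_for_sin(eps)
    for i in range(-samples + 1, samples):
        x = x0 + (i / samples) * delta
        if not abs(math.sin(x) - math.sin(x0)) < eps:
            return False
    return True

sin_ok = check_sin(1e-3, 2.5)
```

Here $\delta$ does not depend on $x_0$ at all, in contrast with the function $x \mapsto x^2$ treated in Examples 4.1.5.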



Example 4.3.9 Consider the function / : 

fix) := I 



i defined by 

if x^ > 2 
if x^ < 2 



Clearly $f$ has a "jump" at $\pm\sqrt{2}$. However, we claim that $f$ is nevertheless continuous: after
all, $\pm\sqrt{2} \notin \mathbb{Q}$. To prove it, let $x_0 \in \mathbb{Q}$, and let $\varepsilon > 0$. We must find $\delta > 0$ such that
$|f(x) - f(x_0)| < \varepsilon$ whenever $x \in \mathbb{Q}$ and $|x - x_0| < \delta$. There are three possibilities:

Case 1: $x_0 < -\sqrt{2}$. Choose $\delta > 0$ such that $|x - x_0| < \delta$ implies $x < -\sqrt{2}$. Then
$|f(x) - f(x_0)| = 0 < \varepsilon$.

Case 2: $-\sqrt{2} < x_0 < \sqrt{2}$. Choose $\delta > 0$ such that $|x - x_0| < \delta$ implies $-\sqrt{2} < x < \sqrt{2}$. Then
$|f(x) - f(x_0)| = 0 < \varepsilon$.

Case 3: $x_0 > \sqrt{2}$. Choose $\delta > 0$ such that $|x - x_0| < \delta$ implies $x > \sqrt{2}$. Then
$|f(x) - f(x_0)| = 0 < \varepsilon$.



□ 






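Exercise 4.3.10 Let $X = (-1,0) \cup (0,1)$. Define $f : X \to \mathbb{R}$ by

$$f(x) := \begin{cases} 0 & \text{if } x \in X \text{ and } x < 0 \\ 1 & \text{if } x \in X \text{ and } x > 0 \end{cases}$$

Show that $f$ is continuous on $X$.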

□ 

Example 4.3.11 We exhibit Dirichlet's function, a function which is discontinuous at every 
point. 

Define $f : \mathbb{R} \to \mathbb{R}$ by

$$f(x) := \begin{cases} 1 & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational} \end{cases}$$

To see that $f$ is everywhere discontinuous is rather easy: Fix $x_0 \in \mathbb{R}$. Choose two sequences
$x_n \to x_0$ and $y_n \to x_0$ such that $(x_n)_n$ consists only of rationals, and $(y_n)_n$ consists only of
irrationals. Then we cannot have both $\lim_n f(x_n) = f(x_0)$ and $\lim_n f(y_n) = f(x_0)$, because
then $0 = 1$.

□ 

Exercise 4.3.12 We exhibit a function which is continuous at every irrational number, but 
discontinuous at every rational number. 

Let $X = (0, \infty)$, and define $f : X \to \mathbb{R}$ as follows: If $x$ is irrational, define $f(x) = 0$. If $x$ is
rational, write $x = \frac{m}{n}$, where $m, n$ are relatively prime positive integers, and define $f(x) = \frac{1}{n}$.

(a) First show that $f$ is discontinuous at every rational number $x > 0$.
[Hint: Consider a sequence $(x_n)_n$ of irrational numbers converging to $x$.]

(b) Explain why the interval $(x - \frac12, x + \frac12)$ contains only finitely many rational numbers which
have denominators that are $\le n$.

[Hint: The interval has length 1, so it can contain no more than 1 integer, no more than
two integer multiples of $\frac12$, no more than three integer multiples of $\frac13$, etc.]

(c) Next we show that $f$ is continuous at every irrational number $x > 0$. So let $x$ be an
irrational number, let $\varepsilon > 0$, and choose $n \in \mathbb{N}$ such that $\frac1n < \varepsilon$. Explain why we are able
to choose a $\delta < \frac12$ such that the interval $(x - \delta, x + \delta)$ contains no rational number with
denominator $\le n$.

(d) Conclude that $0 \le f(y) < \frac1n < \varepsilon$ for every $y \in (x - \delta, x + \delta)$.

(e) Deduce that / is continuous at x if x is irrational. 



□ 
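The choice of $\delta$ in part (c) can be made concrete. The sketch below is an illustration only, under the assumption that a floating-point number may stand in for an irrational $x$; the helper names `thomae` and `delta_for` are hypothetical and do not appear in the notes. For each denominator $q \le n$ it finds the multiple of $\frac1q$ nearest to $x$, and takes $\delta$ to be the smallest of these distances (capped at $\frac12$).

```python
from fractions import Fraction
import math

def thomae(q: Fraction) -> Fraction:
    # Fraction stores q in lowest terms, so q.denominator is the n of the exercise.
    return Fraction(1, q.denominator)

def delta_for(x: float, n: int) -> float:
    # Smallest distance from x to a rational p/q with q <= n, capped at 1/2.
    best = 0.5
    for q in range(1, n + 1):
        p = round(x * q)  # p/q is the multiple of 1/q nearest to x
        best = min(best, abs(x - p / q))
    return best

x = math.sqrt(2)
delta = delta_for(x, 10)
print(delta, thomae(Fraction(6, 4)))
```

Every rational in the open interval $(x - \delta, x + \delta)$ then has denominator larger than $n$, so the function takes values below $\frac1n$ there, which is the content of part (d).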






4.4 Continuous Functions on Compact Sets 

Recall that a set $K \subseteq \mathbb{R}$ is compact iff it is sequentially compact iff it is closed and bounded.
The following result is often regarded as "intuitively clear": 



Theorem 4.4.1 (Intermediate Value Theorem) 

Suppose that $f : [a,b] \to \mathbb{R}$ is a continuous function, and that $f(a) < f(b)$. Suppose that
$y \in \mathbb{R}$ is such that $f(a) < y < f(b)$. Then there exists an $x \in [a,b]$ such that $f(x) = y$.
A similar result holds if $f(a) > f(b)$.



Before we prove it, we make some remarks: 
Remarks 4.4.2 Consider the function $f : \mathbb{Q} \cap [-2,1] \to \mathbb{R}$ defined by

$$f(x) := \begin{cases} 0 & \text{if } x^2 > 2 \\ 1 & \text{if } x^2 < 2 \end{cases}$$

A similar function was studied in Example 4.3.9, from which it is clear that $f$ is continuous
on $\mathbb{Q} \cap [-2,1]$. However, the Intermediate Value Theorem clearly fails to hold, as $f(-2) = 0$,
$f(1) = 1$, yet there is no $x \in \mathbb{Q} \cap [-2,1]$ such that $f(x) = \frac12$.
Now as Körner points out in his book A Companion to Analysis,

"If this theorem is "intuitively clear" over $\mathbb{R}$, it ought to be intuitively
clear over $\mathbb{Q}$."

What makes the Intermediate Value Theorem true in M is, of course, the Completeness Axiom. 
Indeed, the Completeness Axiom is, in a sense that can be made precise, equivalent to the 
Intermediate Value Theorem. 

□ 

Proof: Inductively, we define two sequences $(a_n)$, $(b_n)$ such that

(i) We have

$$a_1 \le a_2 \le a_3 \le \dots \le a_n \le a_{n+1} \le \dots \le b_{n+1} \le b_n \le \dots \le b_3 \le b_2 \le b_1$$

(ii) Furthermore, $f(a_n) \le y \le f(b_n)$ for all $n \in \mathbb{N}$.

(iii) Finally, $b_n - a_n = (b - a)2^{-n+1}$ for all $n \in \mathbb{N}$.

Define $a_1 := a$, $b_1 := b$. Then (i), (ii), (iii) (up to $n = 1$) are clearly true.
Now suppose that $a_n, b_n$ have been defined such that

(i)$_n$ $a_1 \le a_2 \le \dots \le a_n \le b_n \le \dots \le b_2 \le b_1$;

(ii)$_n$ $f(a_n) \le y \le f(b_n)$;

(iii)$_n$ $b_n - a_n = (b - a)2^{-n+1}$.

Define $c_n := \frac{a_n + b_n}{2}$. There are now two possibilities:

Case I: If $f(c_n) < y$, define $a_{n+1} := c_n$ and $b_{n+1} := b_n$.
Case II: If $f(c_n) \ge y$, define $a_{n+1} := a_n$ and $b_{n+1} := c_n$.
In either case we see that

$$a_n \le a_{n+1} \le b_{n+1} \le b_n$$

so that (i)$_{n+1}$ holds. Moreover, certainly $f(a_{n+1}) \le y \le f(b_{n+1})$, so that (ii)$_{n+1}$ is true.

Finally,

$$(b_{n+1} - a_{n+1}) = \tfrac12 (b_n - a_n) = \tfrac12 (b - a)2^{-n+1} = (b - a)2^{-n}$$

proving (iii)$_{n+1}$.

This completes the inductive definition of the sequences $(a_n)$, $(b_n)$. Note that each $a_n, b_n \in [a,b]$.
Further observe that $\lim_n (b_n - a_n) = (b - a) \lim_n 2^{-n+1} = 0$.

As $(a_n)$ is a bounded increasing sequence (bounded above by $b = b_1$, for example), it is
convergent; here is where the Completeness Axiom makes its appearance. Therefore let
$x := \lim_n a_n$. Note also that $b_n = a_n + (b_n - a_n)$, so that

$$\lim_n b_n = \lim_n a_n + \lim_n (b_n - a_n) = x + 0 = x$$

i.e. $\lim_n a_n = x = \lim_n b_n$. Clearly $x \in [a,b]$ also.

Now we use the sequence criterion for continuity: Since $f$ is continuous and $a_n \to x$, we
have $f(a_n) \to f(x)$. We conclude that $f(x) \le y$, as each $f(a_n) \le y$. Similarly $f(b_n) \to f(x)$,
from which we obtain $f(x) \ge y$. We conclude that $f(x) = y$, as required.
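The interval-halving construction in this proof is exactly the bisection method of numerical analysis. The following is a minimal sketch of it, assuming floating-point arithmetic and a stopping tolerance `tol` (both assumptions of the sketch, not of the proof); the function name `bisect_ivt` is illustrative.

```python
def bisect_ivt(f, a, b, y, tol=1e-12):
    # Assumes f is continuous on [a, b] and f(a) <= y <= f(b).
    a_n, b_n = a, b
    while b_n - a_n > tol:
        c_n = (a_n + b_n) / 2
        if f(c_n) < y:
            a_n = c_n      # Case I: move the left endpoint
        else:
            b_n = c_n      # Case II: move the right endpoint
    return (a_n + b_n) / 2

# Solve x^2 = 2 on [0, 2]; the result approximates sqrt(2).
print(bisect_ivt(lambda t: t * t, 0.0, 2.0, 2.0))
```

Each pass through the loop halves the length of the interval, mirroring property (iii), while the invariant $f(a_n) \le y \le f(b_n)$ of property (ii) is maintained throughout.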



The following corollary is of fundamental importance for optimization. It, too, is often 
regarded as "obvious" . 



Theorem 4.4.3 Suppose that $f : [a,b] \to \mathbb{R}$ is continuous. Then $f$ attains its infimum
and supremum on $[a,b]$, i.e. there exist $x^*, x_* \in [a,b]$ such that

$$f(x^*) = \sup_{x \in [a,b]} f(x) \qquad f(x_*) = \inf_{x \in [a,b]} f(x)$$



Exercise 4.4.4 We prove Theorem 4.4.3.

(a) Let $K := [a,b]$. We show that $f$ attains its supremum on $K$, i.e. that there is $x^* \in K$
such that $f(x^*) = \sup f[K]$. We begin by showing that

$$f[K] := \{f(x) : x \in K\}$$

is a subset of $\mathbb{R}$ which is bounded above.

a.(i) Suppose that $f[K]$ is not bounded above. Explain why there is a sequence $(x_n)$ in
$K$ such that $f(x_n) \ge n$.

a.(ii) Explain why the sequence $(x_n)$ has a convergent subsequence.

a.(iii) Explain why the sequence $(f(x_n))$ has a convergent subsequence.

a.(iv) Explain why this leads to contradiction.

We thus see that the set $f[K]$ must be bounded above.

(b) Explain why $y^* := \sup f[K]$ exists.

(c) Explain why there is a sequence $(x_n)$ in $K$ such that $f(x_n) \to y^*$.

(d) Explain why there is a subsequence $(x_{n_k})$ of $(x_n)$ which converges.

(e) Define $x^* := \lim_k x_{n_k}$ to be the limit of this subsequence. Explain why $x^* \in [a,b]$.

(f) Also explain why $f(x^*) = y^*$.

(g) Where is the Completeness Axiom used in this proof? 

□ 



In fact, we can do better than Theorem 4.4.3. Recall that if $f$ is continuous, then the
inverse image of an open set is open. Next, we show that if $f$ is continuous, then the direct
image of compact sets is compact.



Theorem 4.4.5 Suppose that $f : \mathbb{R} \to \mathbb{R}$ is continuous, and that $K \subseteq \mathbb{R}$ is compact.
Then $f[K]$ is compact.



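Proof: Let $\mathcal{U} := \{U_i : i \in I\}$ be an open cover for $f[K]$. We must show that $\mathcal{U}$ has a finite
subcover, i.e. that there are $i_1, \dots, i_n \in I$ such that $f[K] \subseteq \bigcup_{j=1}^n U_{i_j}$. Define $V_i := f^{-1}[U_i]$
for $i \in I$. Since $f$ is continuous, pullbacks along $f$ preserve open sets, i.e. each $V_i$ is open.
Now note that

$$x \in K \implies f(x) \in f[K] \subseteq \bigcup_{i \in I} U_i \implies x \in f^{-1}\Big[\bigcup_{i \in I} U_i\Big] = \bigcup_{i \in I} V_i$$

It follows that $\mathcal{V} := \{V_i : i \in I\}$ is an open cover for $K$. Since $K$ is compact, there are
$i_1, \dots, i_n \in I$ such that $K \subseteq \bigcup_{j=1}^n V_{i_j}$. Then $K \subseteq f^{-1}[\bigcup_{j=1}^n U_{i_j}]$, and thus
$f[K] \subseteq \bigcup_{j=1}^n U_{i_j}$, i.e. $U_{i_1}, \dots, U_{i_n}$ form a finite subcover of $f[K]$.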



We immediately obtain the following improvement of Theorem 4.4.3:



Corollary 4.4.6 Suppose that $f : \mathbb{R} \to \mathbb{R}$ is continuous, and that $K \subseteq \mathbb{R}$ is compact and
non-empty. Then $f$ attains its infimum and supremum on $K$, i.e. there exist $x^*, x_* \in K$
such that

$$f(x^*) = \sup_{x \in K} f(x) \qquad f(x_*) = \inf_{x \in K} f(x)$$



Proof: $f[K]$ is compact, hence closed and bounded. Since $f[K]$ is bounded, $u := \sup_{x \in K} f(x)$
is finite. Since $f[K]$ is closed, $u \in f[K]$. (To see this, note that $u - \frac1n$ is not an upper bound
of $f[K]$, being smaller than the smallest upper bound. Hence there is $y_n \in f[K]$ such that
$u - \frac1n < y_n \le u$, and thus $(y_n)$ is a sequence in $f[K]$ with limit $u$. Since $f[K]$ is closed, it is
closed under limits, and so $u \in f[K]$.) Thus there is $x^* \in K$ such that $f(x^*) = u$. The argument
for the infimum is similar.






Chapter 5 

Differentiable Functions 



In this chapter we shall mainly concern ourselves with the differentiation of real-valued
functions of one variable. Moreover, we shall not even begin to attempt to cover the colossal body
of applications of differential calculus; that has already been accomplished in standard
courses on uni- and multivariate calculus. Our aim is more modest: To provide a rigorous
development of the theory behind the differential calculus.



5.1 Differentiation in R 



We shall define the notion of derivative for functions $f : X \to \mathbb{R}$, where $X \subseteq \mathbb{R}$. Usually, we
shall assume that $X$ is an open or closed interval, but for the moment, we would like as
general a definition as possible.



Definition 5.1.1 Let $f : X \to \mathbb{R}$ be a function, where $X \subseteq \mathbb{R}$, and suppose that $x_0$ both
belongs to $X$ and is a cluster point of $X$. We shall say that $f$ is differentiable at $x_0$ if and
only if the limit

$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \qquad (x \in X)$$

exists. In that case, we call this limit the derivative of $f$ at $x_0$, and denote it by $f'(x_0)$.
$f$ is said to be differentiable if and only if it is differentiable at every point in its domain.



Remarks 5.1.2 (1.) Thus $f$ is differentiable at $x_0$ if and only if there exists a number $L$ such
that

$$\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x \in X \quad \left[\, 0 < |x - x_0| < \delta \implies \left| \frac{f(x) - f(x_0)}{x - x_0} - L \right| < \varepsilon \,\right]$$

Then $f'(x_0) = L$.

(2.) $x_0$ is required to be a cluster point of $X$ so that we can find $x \in X$ that are arbitrarily
close to $x_0$ (without having $x = x_0$). $x_0$ is required to be an element of $X$ so that $f(x_0)$
(which occurs in the definition of $f'(x_0)$) exists.

(3.) Note that if $X$ is an interval, then any point that belongs to $X$ is also a cluster point of
$X$. Generally, therefore, we do not need to worry about the "cluster point condition".






(4.) Equivalently, if the domain $X$ of $f$ is an open interval, then

$$f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}$$

(5.) Just like we can use the order relation on $\mathbb{R}$ to define one-sided limits, we can also define
one-sided derivatives. For example, if $\lim_{x \to x_0^+} \frac{f(x) - f(x_0)}{x - x_0}$ exists, we call it the right-hand
derivative of $f$ at $x_0$.

Indeed, if $X = [a,b]$, then $f'(a)$ (as we have defined it) is just $\lim_{x \to a^+} \frac{f(x) - f(a)}{x - a}$.

(6.) If $f$ is differentiable (i.e. differentiable at every point in its domain), then $f'$ is itself a
function, $f' : X \to \mathbb{R}$. It then makes sense to ask whether $f'$ is differentiable at a point
$x_0$, i.e. whether $\lim_{x \to x_0} \frac{f'(x) - f'(x_0)}{x - x_0}$ exists. If it does, we call this limit the second derivative
of $f$ at $x_0$, and denote it by $f''(x_0)$.

This easily generalizes to higher order derivatives. The $n^{\text{th}}$ derivative of $f$ at $x_0$ is denoted
by $f^{(n)}(x_0)$.



Exercise 5.1.3 Show that if $f : \mathbb{R} \to \mathbb{R}$ is defined by $f(x) = x^n$, where $n \in \mathbb{N}$, then
$f'(x) = nx^{n-1}$.

[Hint: Put $h = x - x_0$. Then ...]

□



Theorem 5.1.4 Suppose that $f : X \to \mathbb{R}$ is differentiable at $x_0$. Then $f$ is continuous at
$x_0$.



Proof: This follows easily from the properties of limits:

$$\lim_{x \to x_0} [f(x) - f(x_0)] = \lim_{x \to x_0} \left[ \frac{f(x) - f(x_0)}{x - x_0} \cdot (x - x_0) \right] = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \cdot \lim_{x \to x_0} (x - x_0) = f'(x_0) \cdot 0 = 0$$



It follows immediately that if a function is discontinuous at a point, then it cannot be
differentiable there. Note, however, that the converse of Theorem 5.1.4 is false: if a function
is continuous at a point, it need certainly not be differentiable there. The function $f(x) = |x|$
is easily shown to be continuous at $x_0 = 0$, but not differentiable there.
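The failure of differentiability of $|x|$ at 0 is visible in its one-sided difference quotients. The sketch below is an illustration only; the helper `difference_quotient` is a hypothetical name, and the step sizes $10^{-k}$ are an arbitrary choice.

```python
def difference_quotient(f, x0, h):
    # (f(x0 + h) - f(x0)) / h for a non-zero step h
    return (f(x0 + h) - f(x0)) / h

# Approach 0 from the right (h > 0) and from the left (h < 0).
right = [difference_quotient(abs, 0.0, 10.0 ** -k) for k in range(1, 8)]
left = [difference_quotient(abs, 0.0, -(10.0 ** -k)) for k in range(1, 8)]
print(right)
print(left)
```

The quotients from the right are all $1$ and those from the left are all $-1$, so the two one-sided derivatives exist but differ, and the two-sided limit cannot exist.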

Remarks 5.1.5 Indeed, there are functions which are continuous at every point, but
differentiable nowhere. Just try to imagine such a function!

Functions which are everywhere continuous but nowhere differentiable used to be regarded as 
"pathological" curiosities. These days they are used every day by physicists and actuaries to 
model random phenomena (such as Brownian motion, or stock prices). 






□ 

The following representation is often useful: 

Lemma 5.1.6 Let $f : X \to \mathbb{R}$. Then $f$ is differentiable at $x_0 \in X$ if and only if there
exist a number $a \in \mathbb{R}$ and a function $u : X \to \mathbb{R}$ such that

(i) $u(x) \to 0$ as $x \to x_0$, and

(ii) $f(x) = f(x_0) + (x - x_0)[a + u(x)]$

In that case $a = f'(x_0)$.



Proof: Suppose that $f$ is differentiable at $x_0$. Define

$$u(x) := \begin{cases} \dfrac{f(x) - f(x_0)}{x - x_0} - f'(x_0) & \text{if } x \ne x_0 \\ 0 & \text{else} \end{cases}$$

Then clearly $u(x) \to 0$ as $x \to x_0$, and (ii) holds with $a = f'(x_0)$.

Conversely, suppose there are a number $a \in \mathbb{R}$ and a function $u$ with the properties stated
above. Then $u(x) = \frac{f(x) - f(x_0)}{x - x_0} - a$ whenever $x \ne x_0$. The fact that $\lim_{x \to x_0} u(x) = 0$ now easily
implies that $f'(x_0)$ exists and that $f'(x_0) = a$.



We must now settle several debts that remain unpaid from a first year course in calculus: 

Theorem 5.1.7 Suppose that $f, g : X \to \mathbb{R}$ are differentiable at $x_0 \in X$. Then the
functions $f + g$, $fg$ and $\frac{f}{g}$ are also differentiable at $x_0$ (assuming $g(x_0) \ne 0$ in the case of
$\frac{f}{g}$), and

(a) $(f + g)'(x_0) = f'(x_0) + g'(x_0)$

(b) $(fg)'(x_0) = f'(x_0)g(x_0) + f(x_0)g'(x_0)$

(c) $\left(\dfrac{f}{g}\right)'(x_0) = \dfrac{g(x_0)f'(x_0) - g'(x_0)f(x_0)}{g(x_0)^2}$



Proof: (a) follows immediately from the properties of limits of functions.
To prove (b), define $h(x) = f(x)g(x)$. Then

$$h(x) - h(x_0) = f(x)[g(x) - g(x_0)] + g(x_0)[f(x) - f(x_0)]$$

and thus

$$\frac{h(x) - h(x_0)}{x - x_0} = f(x)\,\frac{g(x) - g(x_0)}{x - x_0} + g(x_0)\,\frac{f(x) - f(x_0)}{x - x_0}$$

Now let $x \to x_0$. Note that by Theorem 5.1.4 we have $\lim_{x \to x_0} f(x) = f(x_0)$. Now use the
properties of limits of functions to obtain

$$h'(x_0) = f(x_0)g'(x_0) + g(x_0)f'(x_0)$$

Finally, to prove (c), define $h(x) = \frac{f(x)}{g(x)}$, and note that

$$\frac{h(x) - h(x_0)}{x - x_0} = \frac{1}{g(x)g(x_0)} \left[ g(x_0)\,\frac{f(x) - f(x_0)}{x - x_0} - f(x_0)\,\frac{g(x) - g(x_0)}{x - x_0} \right]$$

Letting $x \to x_0$ yields the result.



Theorem 5.1.8 (Chain Rule)
Suppose that

(i) $f : X \to \mathbb{R}$ and $g : Y \to \mathbb{R}$ are functions, where $X, Y$ are intervals in $\mathbb{R}$ such that
$f[X] \subseteq Y$.

(ii) $f$ is differentiable at $x_0 \in X$.

(iii) $g$ is differentiable at $f(x_0) \in Y$.

Then $g \circ f$ is differentiable at $x_0$, and

$$(g \circ f)'(x_0) = g'(f(x_0))\,f'(x_0)$$



Proof: Let $h = g \circ f$, and let $y_0 = f(x_0)$. Since $f, g$ are differentiable at $x_0$ and $y_0$ respectively,
there exist, by Lemma 5.1.6, two functions $u : X \to \mathbb{R}$, $v : Y \to \mathbb{R}$ such that

$$f(x) = f(x_0) + (x - x_0)[f'(x_0) + u(x)]$$

$$g(y) = g(y_0) + (y - y_0)[g'(y_0) + v(y)]$$

and where $\lim_{x \to x_0} u(x) = 0$, $\lim_{y \to y_0} v(y) = 0$.
Now let $y = f(x)$. It follows that

$$\begin{aligned}
h(x) - h(x_0) &= g(y) - g(y_0) \\
&= (y - y_0) \cdot [g'(y_0) + v(y)] \\
&= [f(x) - f(x_0)] \cdot [g'(f(x_0)) + v(f(x))] \\
&= (x - x_0) \cdot [f'(x_0) + u(x)] \cdot [g'(f(x_0)) + v(f(x))] \\
&= (x - x_0) \cdot [g'(f(x_0))f'(x_0) + k(x)]
\end{aligned}$$

where

$$k(x) = u(x)g'(f(x_0)) + v(f(x))f'(x_0) + u(x)v(f(x))$$

Now as $x \to x_0$, we see that $f(x) \to f(x_0)$ (because $f$ is differentiable at $x_0$, and thus
continuous there). This implies that $\lim_{x \to x_0} v(f(x)) = 0$ (because $\lim_{y \to y_0} v(y) = 0$, and $y_0 = f(x_0)$,
with $y = f(x)$). Hence $\lim_{x \to x_0} k(x) = 0$. By Lemma 5.1.6 it follows that $g \circ f$ is differentiable
at $x_0$, and that $(g \circ f)'(x_0) = g'(f(x_0))f'(x_0)$.






Example 5.1.9 Consider the function

$$f(x) = \sin\frac{1}{x}$$

which is defined for all $x \ne 0$. This function has a peculiar property: It bounces between $-1$
and $1$ infinitely often on any interval of the form $(0, \varepsilon)$. To see this, note that

$$\sin\frac{1}{x} = \begin{cases} 0 & \text{if } x = \frac{2}{2n\pi}, \quad n \in \mathbb{N} \\ 1 & \text{if } x = \frac{2}{(4n+1)\pi}, \quad n \in \mathbb{N} \\ -1 & \text{if } x = \frac{2}{(4n+3)\pi}, \quad n \in \mathbb{N} \end{cases}$$

Thus if $x = \frac{2}{2\pi}, \frac{2}{3\pi}, \frac{2}{4\pi}, \frac{2}{5\pi}, \frac{2}{6\pi}, \dots$, then $\sin\frac1x = 0, -1, 0, 1, 0, \dots$. Now draw the graph of $\sin\frac1x$.

Note that $f(x)$ is not defined at $x = 0$. There is also no way that we can define $f(0)$ in
such a way as to make $f$ continuous at zero. For if we could define $f(0)$ to make $f$ continuous,
then we would have

$$f(0) = \lim_{x \to 0} \sin\frac{1}{x}$$

But we can find two sequences $x_n \to 0$ and $y_n \to 0$ such that $f(x_n) \to 1$ and $f(y_n) \to -1$. It
follows that $\lim_{x \to 0} \sin\frac1x$ does not exist.

Next, consider the function

$$g(x) = x \sin\frac{1}{x}$$

which is also defined for $x \ne 0$. In this case, however, we can define $g(0)$ in such a way as
to make $g$ continuous: Simply put $g(0) = 0$. It is clear that whereas $f(x)$ bounces between
$-1$ and $1$, $g(x)$ bounces between $-x$ and $x$. A simple sandwich argument proves that $g$ is
continuous at 0:

$$-|x| \le x \sin\frac{1}{x} \le |x|$$

and thus $\lim_{x \to 0} x \sin\frac1x = 0$.

However, though $g$ is continuous at $x = 0$, $g$ is not differentiable there: For

$$g'(0) = \lim_{x \to 0} \frac{x \sin\frac1x - 0}{x - 0} = \lim_{x \to 0} \sin\frac{1}{x}$$

and $\lim_{x \to 0} \sin\frac1x$ does not exist. Thus $g'(0)$ does not exist either.
Next, consider the function

$$h(x) = \begin{cases} x^2 \sin\frac{1}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$$

A simple sandwich argument shows that $h$ is continuous at $x = 0$. $h$ is also differentiable
there:

$$h'(0) = \lim_{x \to 0} \frac{x^2 \sin\frac1x - 0}{x - 0} = \lim_{x \to 0} x \sin\frac{1}{x} = 0$$

However, the derivative $h'$ is not continuous at $x = 0$:

$$\lim_{x \to 0} h'(x) = \lim_{x \to 0} \left[ 2x \sin\frac{1}{x} - \frac{x^2}{x^2} \cos\frac{1}{x} \right] = -\lim_{x \to 0} \cos\frac{1}{x}$$

It is easy to show that $\lim_{x \to 0} \cos\frac1x$ does not exist, for the same reason that $\lim_{x \to 0} \sin\frac1x$ does not
exist. Hence $\lim_{x \to 0} h'(x)$ does not exist, and thus it can certainly not be equal to $h'(0)$. This
shows that $h'$ is not continuous at $x = 0$.

Since a differentiable function is necessarily continuous, $h''(0)$ does not exist.

□ 
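The behaviour of $h(x) = x^2 \sin\frac1x$ near 0 can be observed numerically. The following sketch is illustrative only: the sample points $\frac{1}{2n\pi}$ and $\frac{1}{(2n+1)\pi}$ are a choice made here (at these points $\cos\frac1x = \pm 1$), and the formula used in `h_prime` is the one obtained for $x \ne 0$ from the product and chain rules.

```python
import math

def h(x):
    return x * x * math.sin(1 / x) if x != 0 else 0.0

def h_prime(x):
    # Valid for x != 0, by the product and chain rules.
    return 2 * x * math.sin(1 / x) - math.cos(1 / x)

# Difference quotients of h at 0: these shrink with the step size.
quotients = [(h(t) - h(0)) / t for t in (10.0 ** -k for k in range(1, 8))]

# Values of h' along two sequences tending to 0.
at_even = [h_prime(1 / (2 * n * math.pi)) for n in range(1, 6)]
at_odd = [h_prime(1 / ((2 * n + 1) * math.pi)) for n in range(1, 6)]
print(quotients)
print(at_even)
print(at_odd)
```

The difference quotients are bounded by the step size and so tend to $h'(0) = 0$, whereas $h'$ stays close to $-1$ along the first sequence and close to $+1$ along the second, so $h'$ has no limit at 0.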

5.2 Mean Value Theorems 

In this section, we continue to repay our debt to first year calculus, finally providing rigorous
proofs for theorems that were just on loan then.¹ Thus we state and prove several well-known
results, without probing for applications.

Definition 5.2.1 Let $f : X \to \mathbb{R}$, where $X \subseteq \mathbb{R}$. We say that $f$ has a local maximum
at $x_0 \in X$ if and only if there is a neighbourhood $U$ of $x_0$ such that $f(x) \le f(x_0)$ for all
$x \in U$.

$f$ has a local minimum at $x_0$ if $-f$ has a local maximum at $x_0$.



Remarks 5.2.2 It is easy to see that $f$ has a local maximum at $x_0$ if and only if there exists
a $\delta > 0$ such that

$$f(x) \le f(x_0) \quad \text{whenever } |x - x_0| < \delta \qquad (x \in X)$$

□ 

Perhaps the single most useful fact in differential calculus is the following: 



Theorem 5.2.3 Suppose $X$ is an interval, and that $x_0$ is an interior point of $X$. If a
function $f : X \to \mathbb{R}$ has a local maximum at $x_0$, and if $f'(x_0)$ exists, then $f'(x_0) = 0$.



Proof: Choose $\delta > 0$ such that

$$f(x) \le f(x_0) \quad \text{whenever } |x - x_0| < \delta \qquad (x \in X)$$

and such that also $(x_0 - \delta, x_0 + \delta) \subseteq X$ (which can be done because $x_0$ is assumed to be an
interior point of $X$). Now if $x_0 - \delta < x < x_0$, we have

$$\frac{f(x) - f(x_0)}{x - x_0} \ge 0$$

because $f(x) - f(x_0) \le 0$ and $x - x_0 < 0$. On the other hand, if $x_0 < x < x_0 + \delta$,
then

$$\frac{f(x) - f(x_0)}{x - x_0} \le 0$$

because $f(x) - f(x_0) \le 0$, but $x - x_0 > 0$. Hence if $\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}$ exists, then
it is both $\ge 0$ and $\le 0$. The result follows.



¹ I'll bet that the rate of interest is very low...






Theorem 5.2.4 (Rolle's Theorem)

Suppose that $f : [a,b] \to \mathbb{R}$ is continuous, with $f(a) = f(b) = 0$. Suppose further
that $f$ is differentiable on the open interval $(a,b)$. Then there exists a $c \in (a,b)$ such that
$f'(c) = 0$.



Proof: This is obvious if $f = 0$ on $[a,b]$. Suppose therefore, that $f$ is not identically zero.
Thus there exists an $x \in (a,b)$ such that $f(x) \ne 0$. Replacing $f$ by $-f$, if necessary, we may
assume that $f$ takes on some strictly positive value.

Now $[a,b]$ is a compact interval, and $f$ is continuous. Hence $f$ attains its supremum, i.e.
there is $c \in [a,b]$ such that $f(c) = \sup_{a \le x \le b} f(x)$. Since this supremum is $> 0$, and since
$f(a) = 0 = f(b)$, we see that in fact $c \in (a,b)$. Thus $f'(c)$ exists, and, by Theorem 5.2.3, we
have $f'(c) = 0$, as required.



The next theorem states that if $f : [a,b] \to \mathbb{R}$ is differentiable, there is some point $c$ between
$a$ and $b$ such that the slope of the tangent at point $c$ is equal to the slope of the chord joining
the points $(a, f(a))$ and $(b, f(b))$ on the graph of $f$. This is a central result in differential
calculus:



Theorem 5.2.5 (Mean Value Theorem)

Suppose that $f : [a,b] \to \mathbb{R}$ is continuous, and that $f$ is differentiable on the open interval
$(a,b)$. Then there exists a point $c \in (a,b)$ such that

$$f(b) - f(a) = f'(c)(b - a)$$



In fact, we can do better: 



Theorem 5.2.6 (Generalized Mean Value Theorem) 

Suppose that $f, g$ are continuous on the closed interval $[a,b]$, and differentiable on the open
interval $(a,b)$. Then there exists a point $c \in (a,b)$ such that

$$[f(b) - f(a)]\,g'(c) = [g(b) - g(a)]\,f'(c)$$



This result is also called the Cauchy Mean Value Theorem. The proof is an exercise: 

Exercise 5.2.7 (a) Prove the Generalized Mean Value Theorem for the case $g(b) \ne g(a)$ by
applying Rolle's Theorem to the function

$$h(x) = f(x) - f(a) - \frac{f(b) - f(a)}{g(b) - g(a)}\,(g(x) - g(a))$$

Also prove the result in case $g(b) = g(a)$.
(b) Hence prove the (ordinary) Mean Value Theorem.

□ 
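The point $c$ promised by the Mean Value Theorem can sometimes be located numerically. The sketch below is an illustration under an extra assumption that the theorem itself does not need: that $f'(a) \le$ slope of the chord $\le f'(b)$ and $f'$ is continuous, so that bisection applied to $f'$ finds $c$. The name `mean_value_point` is hypothetical.

```python
def mean_value_point(f, df, a, b, tol=1e-12):
    # Assumes df is continuous with df(a) <= slope <= df(b); this is an
    # assumption of the sketch, not a hypothesis of the theorem.
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if df(mid) < slope:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# f(x) = x^3 on [0, 2]: the chord has slope 4, and 3c^2 = 4 gives c = 2/sqrt(3).
c = mean_value_point(lambda x: x ** 3, lambda x: 3 * x * x, 0.0, 2.0)
print(c)
```

For $f(x) = x^3$ on $[0,2]$ the exact value is $c = 2/\sqrt{3}$, which lies strictly between the endpoints, as the theorem requires.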

As an (important) application, we prove L'Hopital's Rule. Here's a warm-up exercise: 






Exercise 5.2.8 (1) Suppose that $f, g$ are continuous on $[a,b]$ and differentiable on $(a,b)$. Suppose
further that $f(a) = 0 = g(a)$, and that $g, g'$ do not vanish on $(a,b)$. We prove that

$$\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}$$

(provided the limit on the right exists).

(a) Choose $x$ such that $a < x < b$. Use the Generalized Mean Value Theorem to prove
that there is a $c \in (a,x)$ such that

$$\frac{f(x)}{g(x)} = \frac{f'(c)}{g'(c)}$$

(b) Now let $x \to a$, and deduce the required result.
(2) Define 



ir X 1 



Inx 

1 if a; = 1 
Show that / is continuous at x = 1. 

□ 

Theorem 5.2.9 (L'Hôpital's Rule)

Suppose that $f, g$ are differentiable on $(a,b)$, where $-\infty \le a < b \le \infty$, and that $g'(x) \ne 0$
for $x \in (a,b)$. Suppose further that

$$\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$$

Then if either

(a) $\lim_{x \to a} f(x) = 0 = \lim_{x \to a} g(x)$, or

(b) $\lim_{x \to a} g(x) = \infty$,

then also

$$\lim_{x \to a} \frac{f(x)}{g(x)} = L$$

A similar result holds if $x \to b$ or $g(x) \to -\infty$.



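Proof: Step I: First suppose that $-\infty < L < +\infty$.

(a) Now assume also that $\lim_{x \to a} f(x) = 0 = \lim_{x \to a} g(x)$. Let $\varepsilon > 0$. Since $\frac{f'(x)}{g'(x)} \to L$ as $x \to a$, there
exists a number $c \in (a,b)$ such that

$$\frac{f'(x)}{g'(x)} < L + \varepsilon \quad \text{whenever } a < x < c$$

Then if $a < x < y < c$, we see that there is a $t \in (x,y)$ such that

$$g'(t)[f(x) - f(y)] = f'(t)[g(x) - g(y)]$$

by the Generalized Mean Value Theorem. Note that $g(x) \ne g(y)$, by the Mean Value Theorem,
because $g'$ does not vanish on $(a,b)$. It follows that

$$\frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f'(t)}{g'(t)}$$

and thus (because $a < t < c$)

$$\frac{f(x) - f(y)}{g(x) - g(y)} < L + \varepsilon$$

Now let $x \to a$ to obtain $\frac{f(y)}{g(y)} \le L + \varepsilon$. Thus we have shown that there is a number $c > a$
such that

$$\frac{f(y)}{g(y)} \le L + \varepsilon \quad \text{whenever } a < y < c \qquad (*)$$

(b) Now assume that $\lim_{x \to a} g(x) = +\infty$. As above (with $\frac{\varepsilon}{2}$ in place of $\varepsilon$), we can find a $c' > a$ such that

$$\frac{f'(x)}{g'(x)} < L + \frac{\varepsilon}{2} \quad \text{whenever } a < x < c'$$

and thus, by a similar application of the Generalized Mean Value Theorem, that

$$\frac{f(x) - f(y)}{g(x) - g(y)} < L + \frac{\varepsilon}{2} \quad \text{whenever } a < x < y < c'$$

Now fix $y \in (a,c')$. Because $g(x) \to \infty$ as $x \to a$, we can find a $c''$ with $a < c'' < y$ such
that

$$g(x) > g(y) \text{ and } g(x) > 0 \quad \text{for all } x \in (a,c'')$$

Multiply both sides of the preceding inequality by $\frac{g(x) - g(y)}{g(x)} > 0$ to obtain, after some rearranging,

$$\frac{f(x)}{g(x)} < \left(L + \frac{\varepsilon}{2}\right)\left(1 - \frac{g(y)}{g(x)}\right) + \frac{f(y)}{g(x)} \quad \text{for all } x \in (a,c'')$$

Now as $x \to a$, $g(x) \to \infty$, and thus $\frac{g(y)}{g(x)} \to 0$, $\frac{f(y)}{g(x)} \to 0$, so that the right-hand side tends to
$L + \frac{\varepsilon}{2}$. It follows that $\frac{f(x)}{g(x)} < L + \varepsilon$ for $x$
sufficiently close to $a$, i.e. there is $c > a$ such that

$$\frac{f(x)}{g(x)} < L + \varepsilon \quad \text{whenever } a < x < c \qquad (**)$$

Comparing $(*)$ and $(**)$, we see that in both cases (a) and (b) we have shown that there
exists a $c > a$ such that $\frac{f(x)}{g(x)} \le L + \varepsilon$ whenever $a < x < c$.

A similar argument shows that, for both cases (a) and (b), we can find $c'$ such that

$$\frac{f(x)}{g(x)} \ge L - \varepsilon \quad \text{whenever } a < x < c'$$

Now combine these results: Given any $\varepsilon > 0$, we can find a $c$ such that $L - \varepsilon \le \frac{f(x)}{g(x)} \le L + \varepsilon$
whenever $x \in (a,c)$. This is easily seen to imply that $\frac{f(x)}{g(x)} \to L$ as $x \to a$.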






Step II: Now assume that $L = -\infty$.

Further assume that $\lim_{x \to a} f(x) = 0 = \lim_{x \to a} g(x)$. Let $K$ be any real number. Arguing as in
Step I(a), we can find $c > a$ such that $\frac{f(x)}{g(x)} < K$ whenever $a < x < c$. (Just replace $L + \varepsilon$ by
$K$.) Arguing as in Step I(b), we can conclude the same thing if $\lim_{x \to a} g(x) = +\infty$. Since $K$ is
arbitrary, it follows that $\frac{f(x)}{g(x)} \to -\infty = L$ as $x \to a$.

A similar argument works if $L = +\infty$.

⊣

Exercise 5.2.10 Fill in the missing arguments in Step II of the proof of L'Hopital's Rule. 



□ 



Exercise 5.2.11 Discuss the following argument:

Statement: If $f : \mathbb{R} \to \mathbb{R}$ is differentiable, then the derivative function $f'$ is
continuous.

Proof:

(i) To prove that a function $u : \mathbb{R} \to \mathbb{R}$ is continuous at a point $x$, it suffices to
show that $\lim_{h \to 0} u(x + h) = u(x)$.

(ii) Suppose that $f$ is differentiable at $x$. Define $g(h) = \frac{f(x+h) - f(x)}{h}$. Then $f'(x) = \lim_{h \to 0} g(h)$.

(iii) By L'Hôpital's rule and the chain rule

$$f'(x) = \lim_{h \to 0} g(h) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} = \lim_{h \to 0} \frac{f'(x+h) \cdot 1 - 0}{1} = \lim_{h \to 0} f'(x+h)$$

(iv) Since $\lim_{h \to 0} f'(x + h) = f'(x)$, it follows that $f'$ is continuous at $x$.

(v) Since $x$ was an arbitrary point, $f'$ is a continuous function.



Exercise 5.2.12 Define

$$f(x) = x + \cos x \sin x \qquad g(x) = e^{\sin x} f(x)$$

(a) Show that $\lim_{x \to \infty} f(x) = +\infty = \lim_{x \to \infty} g(x)$.

(b) Show that $\lim_{x \to \infty} \frac{f(x)}{g(x)}$ does not exist.

(c) Show that $f'(x) = 2(\cos x)^2$ and $g'(x) = e^{\sin x} \cos x\,[2\cos x + f(x)]$.

(d) Show that $\frac{f'(x)}{g'(x)} = \frac{2e^{-\sin x} \cos x}{2\cos x + f(x)}$ if $\cos x \ne 0$.

(e) Conclude that $\lim_{x \to \infty} \frac{2e^{-\sin x} \cos x}{2\cos x + f(x)} = 0$.

Here we have a situation where $f, g$ are differentiable functions with $\lim_{x \to \infty} f(x) = \infty = \lim_{x \to \infty} g(x)$,
yet L'Hôpital's rule doesn't seem to work! The reason is as follows: We saw that

$$\frac{f'(x)}{g'(x)} = \frac{2e^{-\sin x} \cos x}{2\cos x + f(x)}$$

if $\cos x \ne 0$. However, as $x \to \infty$, $\cos x = 0$ for infinitely many $x$. It
is therefore not the case that $\frac{f'(x)}{g'(x)} = \frac{2e^{-\sin x} \cos x}{2\cos x + f(x)}$ for all $x$.

For $\lim_{x \to \infty} \frac{f'(x)}{g'(x)}$ to exist, it is necessary that $g'(x)$ is non-zero for all sufficiently large $x$.
Similarly, for $\lim_{x \to a} \frac{f'(x)}{g'(x)}$ to exist, it is necessary that $g'(x)$ is non-zero in some neighbourhood of
$a$.

□
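The two quotients in this exercise behave very differently, which a few numerical evaluations make visible. This is a sketch for illustration only; the sample points $2n\pi \pm \frac{\pi}{2}$ and the function names `ratio` and `derivative_ratio` are choices made here, and `derivative_ratio` implements the cancelled expression from part (d), which is only meaningful where $\cos x \ne 0$.

```python
import math

def f(x):
    return x + math.cos(x) * math.sin(x)

def g(x):
    return math.exp(math.sin(x)) * f(x)

def ratio(x):
    # f(x)/g(x), which equals exp(-sin x) wherever f(x) != 0
    return f(x) / g(x)

def derivative_ratio(x):
    # Expression from part (d): f'(x)/g'(x) after cancelling cos(x)
    return 2 * math.exp(-math.sin(x)) * math.cos(x) / (2 * math.cos(x) + f(x))

n = 1000
print(ratio(2 * math.pi * n + math.pi / 2))  # close to 1/e
print(ratio(2 * math.pi * n - math.pi / 2))  # close to e
print(derivative_ratio(1e6))                 # small
```

The quotient $\frac{f}{g}$ keeps returning to values near $\frac1e$ and near $e$, so it has no limit, while the expression from part (d) becomes small for large $x$.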



The final result in this section is Taylor's Theorem, which plays an important role in
numerical analysis. Recall that $f^{(k)}$ denotes the $k^{\text{th}}$ derivative of $f$. Also, let $f^{(0)} = f$.
Finally, recall that $0! = 1$.



Theorem 5.2.13 (Taylor's Theorem)

Let $n \in \mathbb{N}$. Suppose that $f : [a,b] \to \mathbb{R}$ is a function with the property that

(i) $f, f', f'', \dots, f^{(n-1)}$ are defined and continuous on $[a,b]$;

(ii) $f^{(n)}$ exists on $(a,b)$.

Suppose further that $\alpha, \beta$ are real numbers such that $a \le \alpha < \beta \le b$. Then there exists a
$\gamma \in (\alpha, \beta)$ such that

$$f(\beta) = \sum_{k=0}^{n-1} \frac{f^{(k)}(\alpha)}{k!} (\beta - \alpha)^k + \frac{f^{(n)}(\gamma)}{n!} (\beta - \alpha)^n$$



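Proof: For $x \in [a,b]$, define

$$P(x) := \sum_{k=0}^{n-1} \frac{f^{(k)}(\alpha)}{k!} (x - \alpha)^k$$

We must show that there is a $\gamma \in (\alpha, \beta)$ such that $f(\beta) = P(\beta) + \frac{f^{(n)}(\gamma)}{n!}(\beta - \alpha)^n$. Now let
$M$ be the number defined by

$$f(\beta) = P(\beta) + M(\beta - \alpha)^n$$

and define

$$g(x) = f(x) - P(x) - M(x - \alpha)^n \qquad (x \in [a,b])$$

Note that $g(\beta) = 0$, by definition of $M$. Note also that

$$g^{(k)}(\alpha) = f^{(k)}(\alpha) - P^{(k)}(\alpha) = 0 \qquad k = 0, 1, \dots, n-1$$

because $P^{(k)}(\alpha) = f^{(k)}(\alpha)$ for such $k$.

We must show that $M = \frac{f^{(n)}(\gamma)}{n!}$ for some $\gamma$ between $\alpha$ and $\beta$. Note that $P(x)$ is an
$(n-1)^{\text{th}}$ degree polynomial in $x$, so that $P^{(n)}(x) = 0$. It follows that

$$g^{(n)}(x) = f^{(n)}(x) - n!\,M$$

Thus if we can find a $\gamma$ such that $g^{(n)}(\gamma) = 0$, we will have $M = \frac{f^{(n)}(\gamma)}{n!}$, as required.

Now both $g(\alpha) = 0$ and $g(\beta) = 0$. By the Mean Value Theorem, there is $\gamma_1 \in (\alpha, \beta)$ such
that $g'(\gamma_1) = 0$. Thus both $g'(\alpha) = 0$ and $g'(\gamma_1) = 0$. By the Mean Value Theorem, there is
$\gamma_2 \in (\alpha, \gamma_1)$ such that $g''(\gamma_2) = 0$. Thus both $g''(\alpha) = 0$ and $g''(\gamma_2) = 0$. By the Mean Value
Theorem, there is $\gamma_3$ ...

After $n - 1$ steps, we obtain, from the fact that both $g^{(n-1)}(\alpha) = 0$ and $g^{(n-1)}(\gamma_{n-1}) = 0$,
a $\gamma_n \in (\alpha, \gamma_{n-1})$ such that $g^{(n)}(\gamma_n) = 0$.

$\gamma_n$ is therefore the $\gamma$ that we seek.

⊣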



Remarks 5.2.14 Note that if $n = 1$, Taylor's Theorem is just the ordinary Mean Value
Theorem.

□ 



If $f$ is a function with derivatives of all orders, then the Taylor series of $f$ about a point
$x = a$ is defined by

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n$$

We will have more to say about Taylor series in a future chapter. For the moment, however,
let's just show that the well-known Taylor series for $e^x$ converges, and to the right value, for
each $x \in \mathbb{R}$:

Exercise 5.2.15

Let $f$ be the function $f(x) := e^x$. Fix $x \ne 0$, and let $[a,b]$ be an interval containing both
$0$ and $x$. The aim of this problem is to show that the power series

$$\sum_{k=0}^{\infty} \frac{x^k}{k!} \qquad (\dagger)$$

converges to $e^x$.

5.1 Show that the power series $(\dagger)$ converges, irrespective of the value of $x$.

5.2 Use Taylor's Theorem, as stated above, to show that

$$e^x = \sum_{k=0}^{n} \frac{x^k}{k!} + R_{n+1}(x)$$

where

$$|R_{n+1}(x)| \le C \cdot \frac{|x|^{n+1}}{(n+1)!} \qquad C \text{ is a constant}$$

5.3 Show that $\lim_{n \to \infty} R_{n+1}(x) = 0$, irrespective of the value of $x$.

5.4 Now explain why $\sum_{k=0}^{\infty} \frac{x^k}{k!}$ converges to $e^x$.



□ 
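The remainder estimate of this exercise can be compared with actual errors. The following sketch is illustrative only: it assumes the constant $C = e^{|x|}$, which bounds $e^t$ for $t$ between $0$ and $x$ (one admissible choice, not necessarily the one intended in the exercise), and the names `taylor_exp` and `remainder_bound` are hypothetical.

```python
import math

def taylor_exp(x, n):
    # Partial sum: sum over k = 0..n of x^k / k!
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def remainder_bound(x, n):
    # Assumed constant C = e^{|x|}: |R_{n+1}(x)| <= C |x|^{n+1} / (n+1)!
    return math.exp(abs(x)) * abs(x) ** (n + 1) / math.factorial(n + 1)

for x in (-3.0, 0.5, 4.0):
    for n in (5, 10, 20):
        error = abs(math.exp(x) - taylor_exp(x, n))
        print(x, n, error, remainder_bound(x, n))
```

In each row the observed error sits below the bound, and both shrink rapidly as $n$ grows, in line with part 5.3.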



It is unfortunately not the case that the Taylor series of a function always converges to 
the correct value, however. Here is an example: 



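Example 5.2.16 Define $f : \mathbb{R} \to \mathbb{R}$ by

$$f(x) := \begin{cases} e^{-1/x^2} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$$

We claim that the Taylor series for $f(x)$ about $0$ is $\sum_{n=0}^{\infty} \frac{0}{n!} x^n = 0$, because $f^{(n)}(0) = 0$ for all $n$.
Thus the Taylor series does converge for every $x$, but it never converges to the correct value
$e^{-1/x^2}$, except at $x = 0$.

It is not hard to see that $f$ is continuous, i.e. that $\lim_{x \to 0} f(x) = 0$. Note that if $x \ne 0$,
then

$$f'(x) = \frac{2}{x^3} f(x) \qquad f''(x) = \left(\frac{4}{x^6} - \frac{6}{x^4}\right) f(x)$$

etc., and that, generally

$$f^{(n)}(x) = P_n\left(\tfrac{1}{x}\right) f(x) \qquad x \ne 0$$

where $P_n(x)$ is a polynomial. Now by L'Hôpital's Rule,

$$\lim_{x \to 0} \frac{f(x)}{x} = \lim_{x \to 0} \frac{e^{-1/x^2}}{x} = \lim_{x \to 0} \frac{1/x}{e^{1/x^2}} = 0$$

Assume now that $\lim_{x \to 0} \frac{f(x)}{x^k} = 0$ holds for $k = 0, 1, \dots, n$. Then, again by L'Hôpital's Rule,

$$\lim_{x \to 0} \frac{f(x)}{x^{n+1}} = \lim_{x \to 0} \frac{x^{-(n+1)}}{e^{1/x^2}} = \frac{n+1}{2} \lim_{x \to 0} \frac{f(x)}{x^{n-1}} = 0$$

By induction, and taking linear combinations, it follows that

$$\lim_{x \to 0} P\left(\tfrac{1}{x}\right) f(x) = 0 \quad \text{for any polynomial } P$$

Now

$$f'(0) = \lim_{h \to 0} \frac{f(h)}{h} = 0$$

Assume now that $f^{(k)}(0) = 0$ for $k = 1, \dots, n$. Then

$$\begin{aligned}
f^{(n+1)}(0) &= \lim_{h \to 0} \frac{f^{(n)}(h) - f^{(n)}(0)}{h} \\
&= \lim_{h \to 0} \frac{P_n(1/h) f(h)}{h} \\
&= \lim_{h \to 0} Q_n(1/h) f(h) \\
&= 0
\end{aligned}$$

because $Q_n(x) := xP_n(x)$ is a polynomial. By induction, $f^{(n)}(0) = 0$ for all $n$.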

□ 
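The extreme flatness of this function at 0 is easy to observe numerically. The sketch below is an illustration only; the helper `scaled` is a hypothetical name for the quotient $f(x)/x^n$ discussed above, and the sample points are arbitrary.

```python
import math

def f(x):
    return math.exp(-1.0 / (x * x)) if x != 0 else 0.0

def scaled(x, n):
    # f(x) / x^n, which the example shows tends to 0 as x -> 0 for every n
    return f(x) / x ** n

# The Taylor series about 0 is identically 0, yet f is positive away from 0.
print(f(0.5), f(0.1))
print(scaled(0.1, 10), scaled(0.05, 20))
```

Even after division by a high power of $x$ the values are minute, while $f(x)$ itself is strictly positive for $x \ne 0$ and so differs from the value $0$ of its Taylor series.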






Appendix A 

Logic, Sets and Functions 

A.1 Logic and Formal Language

We introduce here a formal language for talking about mathematical objects. This lan- 
guage is very precise, and unambiguous — properties which are largely absent from spoken 
languages such as English, but obviously essential for mathematics. But, as a result, this 
language is rather restricted in scope. The reason we use it is to make certain statements 
amenable to logical analysis. The purpose of logical analysis is to decide whether a par- 
ticular sentence/expression (e.g. about mathematical objects) is true (T) or false (F). A 
sentence/expression that is either true or false (but not both!) is called a statement. 

Example A.1.1 Here are some typical examples of statements: 

• 1 + 1 = 3. 

• All apples are red. 

• The equation $x^2 + 2x + 1 = 0$ has a real root.

• Either $x^2 + a = 0$ has a real root, or $a > 0$.

• There exist infinitely many prime numbers.

• Every continuous function is differentiable. 
Note that a mathematical statement need not be true. 

□ 

Exercise A. 1.2 Which of the following are statements? For each statement, try to decide 
whether it is true or false. 

1. 1 + 1 

2. 3 is greater than 0. 

3. $\sqrt{2}$ is an irrational number.

4. $x^2 - 1 = 0$.

5. If $x = 1$, then $x^2 - 1 = 0$.

6. If $x^2 - 1 = 0$, then $x = 1$.

7. The moon is made of cheese.

8. The moon is a tasty snack.

9. If the moon is made of cheese, then the moon is a tasty snack.

10. The sentence $\varphi$ defined by

$\varphi$ = "The sentence $\varphi$ is false"

11. All unicorns are white. 

12. All unicorns are pink. 

□ 

More complicated statements in our formal language are built up from a collection of symbols, 
including amongst others 

• Symbols for objects and relations; 

• Logical Connectives; 

• Quantifiers; 

We will briefly discuss each of these in turn. None of this material is difficult, though it may 
take a little while to get used to. 

A.1.1 Symbols denoting Objects, Operations and Relations

When doing mathematics, we use symbols to denote certain mathematical objects, operations
and relations. For example, the expression

$$x + 3 < x/\pi$$

contains the following symbols:

(i) Symbols denoting fixed objects, namely the constants $3$ and $\pi$;

(ii) A symbol denoting a variable object, namely $x$;

(iii) Symbols denoting operations, namely $+$ and $/$;

(iv) A symbol denoting a relationship, namely $<$;

So our language will contain symbols for

• Variables: Typically we use the symbols $x, y, z, x_1, x_2, x_3, \dots$

• Constants: e.g. $0, 1, 2, \dots$, or $\pi$, etc.

• Functions/Operations: $+$, $\cdot$, ^, $\cup$, $\cap$

• Properties and Relations: e.g. $=$, $<$, $>$, $\in$, $\subseteq$, etc.






A.1.2 Logical Connectives

Once we are able to make basic statements such as $1 > 0$ and $x = 3$, we are able to combine
them using the logical connectives and, or, implies (then), not to make new statements such
as

$(1 > 0)$ and $(x = 3)$;   If $x > 0$ then $y = 1$;   $x \neq 0$

We use the following symbols for these connectives:

  $\wedge$             and
  $\vee$               or
  $\neg$               not
  $\to$                implies, then
  $\leftrightarrow$    if and only if



In our formal language, these connectives have precise meanings: If $\varphi$, $\psi$ denote statements,
then

$\varphi \wedge \psi$ is true $\iff$ both $\varphi$, $\psi$ are true.

$\varphi \vee \psi$ is true $\iff$ at least one of $\varphi$, $\psi$ is true, perhaps both.

$\varphi \to \psi$ is true $\iff$ whenever $\varphi$ is true, so is $\psi$,
i.e. it is not the case that $\varphi$ is true but $\psi$ is false.

$\varphi \leftrightarrow \psi$ is true $\iff$ both $\varphi \to \psi$ and $\psi \to \varphi$ are true,
i.e. $\varphi$, $\psi$ are simultaneously true, or simultaneously false.

$\neg\varphi$ is true $\iff$ $\varphi$ is false.

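Here is a truth table for the logical connectives:

$$
\begin{array}{cc|ccccc}
\varphi & \psi & \varphi \wedge \psi & \varphi \vee \psi & \varphi \to \psi & \varphi \leftrightarrow \psi & \neg\varphi \\
\hline
T & T & T & T & T & T & F \\
T & F & F & T & F & F & F \\
F & T & F & T & T & F & T \\
F & F & F & F & T & T & T
\end{array}
$$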

This means, for example, that if $\varphi$ is true and $\psi$ is false (the second row of the table),
then $\varphi \wedge \psi$ is false, $\varphi \vee \psi$ is true, $\varphi \to \psi$ is false, etc.
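The table above can be regenerated mechanically, since each connective depends only on truth values. The following is a minimal Python sketch; the helper names `implies`, `iff` and `truth_table` are illustrative choices and not part of the notes.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    # phi -> psi is false only when phi is true and psi is false
    return (not p) or q

def iff(p: bool, q: bool) -> bool:
    # phi <-> psi is true exactly when both have the same truth value
    return p == q

def truth_table():
    """Rows (phi, psi, and, or, implies, iff, not phi) for all four assignments."""
    rows = []
    for p, q in product([True, False], repeat=2):
        rows.append((p, q, p and q, p or q, implies(p, q), iff(p, q), not p))
    return rows

# Print the rows in the same order as the table: TT, TF, FT, FF
for row in truth_table():
    print(" ".join("T" if v else "F" for v in row))
```

Reading off the second printed row reproduces the case where the first statement is true and the second is false.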

Now it is extremely important to note that the logical use of and ($\wedge$), or ($\vee$), and implies
($\to$), though related to their common usage in English, is certainly not identical to it. In
particular the truth value T or F of an expression such as $\varphi \wedge \psi$, $\varphi \vee \psi$, $\varphi \to \psi$ etc. depends
only on the truth values of $\varphi$ and $\psi$, and not on any meaning that the statements $\varphi$, $\psi$ might
possess! Let us discuss some of the pitfalls:



• And, $\wedge$:

$$
\begin{array}{cc|c}
\varphi & \psi & \varphi \wedge \psi \\
\hline
T & T & T \\
T & F & F \\
F & T & F \\
F & F & F
\end{array}
$$

To say that $\varphi \wedge \psi$ is true simply means that both $\varphi$ and $\psi$ are true. It does not assert any
connection (causal or otherwise) between $\varphi$ and $\psi$. This is not typically true in English.
With the English and, the following sentences have rather different meanings, but with
the logical and they mean the same thing:






1. Alice got drunk and failed her test. 

2. Alice failed her test and got drunk. 



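• Or, $\vee$:

$$
\begin{array}{cc|c}
\varphi & \psi & \varphi \vee \psi \\
\hline
T & T & T \\
T & F & T \\
F & T & T \\
F & F & F
\end{array}
$$

$\varphi \vee \psi$ is true precisely when at least one of $\varphi$, $\psi$ is true, possibly both. In particular, it
is not exclusive-or ("either. . . , or. . . "). Thus the statement

$(1 > 0) \vee (5 \text{ is a prime number})$

is true.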



• Implies, Then, $\to$:

$$
\begin{array}{cc|c}
\varphi & \psi & \varphi \to \psi \\
\hline
T & T & T \\
T & F & F \\
F & T & T \\
F & F & T
\end{array}
$$

The statement $\varphi \to \psi$ is true if whenever $\varphi$ is true, then so is $\psi$. In particular,

$\varphi \to \psi$ is false if and only if $\varphi$ is true but $\psi$ is false.

There are severe differences between the English usage and the mathematical usage of
implies. In English usage, implies (or then) usually involves a causal connection, as
in "If it is raining, then it is wet outside." It is wet because of the rain. But such a
connection is irrelevant for the logical then. For example, the statement

$(1 > 0) \to (5 \text{ is a prime number})$

is true. Of course, the reason that 5 is prime is not the fact that $1 > 0$!
There is no causal connection.

We repeat: A logical statement $\varphi \to \psi$ is false only when $\varphi$ is true and $\psi$ is false (just
look at the truth table).

- In particular, if $\psi$ is true, then $\varphi \to \psi$ is also true, no matter what $\varphi$ might be.

- Even more surprisingly, if $\varphi$ is false, then $\varphi \to \psi$ is true, i.e. a false statement
implies any other statement! In particular

$(0 = 1) \to (\text{The Moon is made of cheese})$

is true.

Exercise A.1.3 (a) Two statements $P$, $Q$ are said to be logically equivalent, and we write
this as $P \iff Q$, if and only if $P$, $Q$ have the same truth value. There is an algorithm
to check if two statements are logically equivalent: Simply construct a truth table for
$P$, $Q$ and show that the truth values for $P$, $Q$ are always the same.^1 Show that

$\neg(\varphi \vee \psi) \iff (\neg\varphi) \wedge (\neg\psi)$



that 
that 

and that 

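[Hint: For the first equivalence, construct the truth table

$$
\begin{array}{cc|ccccc}
\varphi & \psi & \varphi \vee \psi & \neg(\varphi \vee \psi) & \neg\varphi & \neg\psi & (\neg\varphi) \wedge (\neg\psi) \\
\hline
T & T & & & & & \\
T & F & & & & & \\
F & T & & & & & \\
F & F & & & & &
\end{array}
$$

The truth value entries in the $\neg(\varphi \vee \psi)$-column and the $(\neg\varphi) \wedge (\neg\psi)$-column should be
identical (or else you've made a mistake). This means that $\neg(\varphi \vee \psi)$ is true precisely
when $(\neg\varphi) \wedge (\neg\psi)$ is true, and hence that the statements are equivalent. Repeat for the
other equivalences that must be shown.]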

(b) Show that

$(\varphi \to \psi) \iff (\neg\psi \to \neg\varphi)$

This is important in proofs: To show that $\psi$ follows from $\varphi$ it is enough to show that if
$\psi$ fails to be true, then $\varphi$ also fails to be true.



□ 
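The truth-table algorithm for logical equivalence described in the exercise above can be sketched in a few lines of Python. This is only an illustration under our own conventions: `IMPLIES` encodes the truth table of implication as a dictionary, and `equivalent` is a hypothetical helper name. The example deliberately checks an equivalence that is not one of the exercise parts.

```python
from itertools import product

# Truth table of "implies", written out row by row
IMPLIES = {
    (True, True): True,
    (True, False): False,
    (False, True): True,
    (False, False): True,
}

def implies(a: bool, b: bool) -> bool:
    return IMPLIES[(a, b)]

def equivalent(p, q, nvars: int = 2) -> bool:
    """True if the formulas p and q agree under every assignment of truth values."""
    return all(p(*vals) == q(*vals)
               for vals in product([True, False], repeat=nvars))

# phi -> psi has the same truth table as (not phi) or psi
same = equivalent(implies, lambda a, b: (not a) or b)

# but phi -> psi is not equivalent to its converse psi -> phi
converse_same = equivalent(implies, lambda a, b: implies(b, a))
```

Here `same` comes out true and `converse_same` false: the two columns for an implication and its converse differ in the rows where exactly one statement is true.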



Exercise A.1.4 A proof by contradiction works as follows: To prove that a statement $P$
is true, it is enough to show that there is a known false statement $Q$ so that $\neg P \to Q$ is true,
i.e. so that assuming that $P$ is not true leads to a false statement. We may then conclude
that $P$ is true (for if $P$ were false, then $\neg P$ would be true, and since $\neg P \to Q$ is true, we may
conclude that $Q$ is true, contradicting the fact that $Q$ is known to be false).

Can you demonstrate the above reasoning using a truth table?
[Hint: Construct a truth table as above, and then remove rows that contradict what you
know. You know $Q$ is false, so you can remove rows in which $Q$ is true. You know $\neg P \to Q$
is true, so . . . ]

□ 



^1 This method probably appears first in Ludwig Wittgenstein's Tractatus Logico-Philosophicus, but he
undoubtedly cribbed the idea from Gottlob Frege's Begriffsschrift.






A.1.3 Quantifiers

Many mathematical statements assert the existence of a mathematical object with certain
properties. For example to say that

$x^2 - 1 = 0$ has a real root

is to say that there exists a real number $c$ such that $c^2 - 1 = 0$.

Other mathematical statements assert that something is true for all objects (of a prespecified
type), for example

For every real number $x$, $x^2 \geq 0$.

We therefore introduce the following symbols for quantifiers:

  $\forall$    For all
  $\exists$    There exists

A quantifier always occurs in conjunction with a variable, i.e. as $\forall x$ or as $\exists x$. Thus if
$\varphi(x)$ is a statement about $x$, then

$\forall x\,\varphi(x)$ is true iff the statement $\varphi(x)$ is true for every $x$

$\exists x\,\varphi(x)$ is true iff the statement $\varphi(x)$ is true for at least one $x$

Frequently, if we want to restrict the domain to a particular set $X$, we may also write
$\forall x \in X\,\varphi(x)$ or $\exists x \in X\,\varphi(x)$. Thus

$(\exists x \in X)\,\varphi(x)$ is true iff there is at least one $x \in X$ for which the statement $\varphi(x)$ is true

Thus the statement $\exists x \in \mathbb{R}\,(x^2 - 1 = 0)$ asserts that the equation $x^2 - 1 = 0$ has a real
root.

The statement $\forall x \in \mathbb{R}\,(x^2 \geq 0)$ asserts that the square of any real number is non-negative.
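Over a finite domain, quantified statements can be evaluated directly, which makes their meaning concrete. The sketch below is an illustration only: it assumes a small finite stand-in domain `X` (the real numbers themselves cannot be enumerated), and uses Python's built-in `all` and `any` in the roles of the universal and existential quantifiers.

```python
X = range(5)  # finite stand-in domain {0, 1, 2, 3, 4}; an assumption for illustration

# "there exists x in X with x^2 - 1 = 0"
exists_root = any(x * x - 1 == 0 for x in X)

# "for all x in X, x^2 >= 0"
forall_nonneg = all(x * x >= 0 for x in X)

# The order of quantifiers matters:
# "for all x there exists y with x != y" holds in X ...
forall_exists = all(any(x != y for y in X) for x in X)
# ... but "there exists y such that for all x, x != y" fails (take x = y)
exists_forall = any(all(x != y for x in X) for y in X)

# Negating "for all" gives "there exists ... not"
def phi(x: int) -> bool:
    return x < 3

negated_forall = not all(phi(x) for x in X)
exists_not = any(not phi(x) for x in X)
```

The last two values always agree, whatever the statement `phi` and the finite domain are.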

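Exercise A.1.5 Decide if the following sentences about real numbers are true or false:

(a) $\exists x \in \mathbb{R}\,(x^2 = -1)$

(b) $\exists x \in \mathbb{N}\,(4x = 1)$

(c) $\exists x \in \mathbb{R}\,(4x = 1)$

(d) $\forall x \in \mathbb{R}\ \exists y \in \mathbb{R}\,(x < y)$

(e) $\exists y \in \mathbb{R}\ \forall x \in \mathbb{R}\,(x < y)$

(f) $\exists y \in [0,1]\ \forall x \in [0,1]\,(x < y)$

(g) $\forall x \in \mathbb{R}\ \forall y \in \mathbb{R}\,[xy = 0 \to (x = 0 \vee y = 0)]$

(h) $\forall x \in \mathbb{R}\ \forall y \in \mathbb{R}\ \exists z \in \mathbb{R}\,[x + z = y]$

(i) $\exists z \in \mathbb{R}\ \forall x \in \mathbb{R}\ \forall y \in \mathbb{R}\,[x + z = y]$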

□ 

Exercise A.1.6 Rewrite the following sentences about numbers using logical notation.

(a) The integer $x$ is an even number. [Hint: An integer $x$ is even if and only if there is an
integer $y$ such that $x = 2y$.]

(b) $x$ is an odd number.

(c) Any integer is either odd or even.

(d) For any positive integer, there is another integer so that their sum is negative.

(e) $x$ is a rational number.

(f) $\sqrt{2}$ is an irrational number.

□ 

Note that we have the following equivalence of statements:

$\neg(\forall x\,\varphi(x)) \iff \exists x\,(\neg\varphi(x))$

For if it isn't the case that the statement $\varphi(x)$ is true for every $x$, then there is at least one
$x$ for which the statement $\varphi(x)$ is false, and thus for which $\neg\varphi(x)$ is true.

Exercise A.1.7 Explain why

$\neg(\exists x\,\varphi(x)) \iff \forall x\,(\neg\varphi(x))$

□

Thus a negation sign can "creep" past a quantifier, but it flips the quantifier in the process.
For example,

$\neg[\forall x\,\exists y\,(y > x)] \iff \exists x\,\neg[\exists y\,(y > x)]$
$\iff \exists x\,\forall y\,(y \leq x)$



One more thing: The variable $x$ in a statement of the form $\forall x\,\varphi(x)$ or $\exists x\,\varphi(x)$ is unimportant,
i.e. the meaning of the statement remains the same if we change the variable (provided
that the new variable does not already occur in the statement $\varphi$). This is just like what
happens for definite integrals: For example, we have

$\int_a^b f(x)\,dx = \int_a^b f(y)\,dy$

Just so, we have

$\forall x\,\varphi(x) \iff \forall y\,\varphi(y)$ and $\exists x\,\varphi(x) \iff \exists y\,\varphi(y)$

provided $y$ does not already occur in $\varphi$.



A.2 Sets, Functions and Relations

The philosophical debate about the nature of mathematical objects was given a boost when
it became generally accepted (in the early 20th century) that, in principle, all mathematical
objects "should" be sets and mathematical notions "should" be expressible as relationships
between sets. This means, for example, that $\sqrt{2}$ is a set! Actually, you mustn't take this too
literally: what is meant is that set theory is flexible enough to interpret all mathematical
objects as sets. We have mentioned before that the first satisfactory answers to the question
"What is a real number?" were given independently, but nearly simultaneously (1872), by
Cantor and Dedekind:

• Cantor: A real number $a$ is a certain set of sequences of rational numbers (namely the
set of all such sequences that converge to $a$, but the definition can be phrased in such
a way as to remove the circularity).

• Dedekind: A real number $a$ is a certain set of rational numbers (namely the set
$\{x \in \mathbb{Q} : x < a\}$, but again the definition can be made non-circular).

The point is that both these approaches construct an object (a complete ordered field) that
behaves just like the real numbers. The ingredients in the construction are simpler objects,
namely rational numbers. Both approaches provide a concrete construction of an object that
behaves just like the set of reals. And we do not really care what real numbers are, but
only how they behave and interrelate. [In the same way, a chess player doesn't care what a
chess piece is. Whether a piece is made of wood, or plastic, or appears on a computer screen
is completely irrelevant. What matters to the chess player is how the piece behaves, i.e. how
the rules (axioms) allow it to interact with other pieces on the board.]

In the same way, almost any other mathematical object can be interpreted as a set, in
some way or another. For this reason, every mathematician needs just a little set theory. The
material in this section is not difficult, and no doubt you have seen much of it before.



Intuitively, a set is just a collection of objects. 



If $A$ is a set and $x$ is some mathematical object, we say that

$x \in A$ ($x$ is an element of $A$)

if $x$ is amongst the objects collected in $A$, and we write

$x \notin A$

if it isn't.

The idea is that a set is characterized entirely by its elements. Thus if two sets A and B 
have exactly the same elements, then we must have A = B. For example, the sets A = {a} 
and B = {a, a} have the same elements, namely only a. Thus A = B. The fact that B seems 
to have two copies of a is immaterial. 

Here are some remarks for the philosophically minded: 

• Any definition involves some terms, and you can always ask for a definition of those terms. Those
definitions will involve further terms, whose definition you can ask for, leading to either an infinite
regress or circularity. We have to start somewhere, and we regard the notions of set and element as so
basic that we need not define them: "We all know what is meant."

And indeed, the idea of forming sets of objects is basic: If you want to count some objects, you first
have to decide which objects you want to count, i.e. you first have to (mentally) put those objects in a
set before you can count them. Forming sets is even more basic, therefore, than counting!

It is absolutely remarkable that starting with just this undefined notion we can rigorously develop almost
all mathematics, and certainly all applied mathematics.

• Saying that two sets are equal if and only if they have the same elements means, for example, that

{Evening Star} = {Morning Star}

as both sets are equal to {planet Venus}. Yet the Evening Star is seen only in the evening, whereas
the Morning Star is seen only in the morning. . .

Instead of set, we will also sometimes say class, collection or family; instead of saying $x$ is
an element of $A$ we will sometimes say $x$ is a member of $A$ or $x$ belongs to $A$.
There are two ways to represent sets:

(i) By listing its elements, and

(ii) By some defining property.

For example, if a set $A$ has finitely many elements $a_1, \dots, a_n$ then it can be represented by
$A = \{a_1, a_2, \dots, a_n\}$. On the other hand if $A$ is the set of all $x$ having a certain property
$P(x)$, then $A$ can be denoted by $A = \{x : P(x)\}$.
In analysis, the following sets are important:

• The set of natural numbers $\mathbb{N} = \{0, 1, 2, 3, \dots\}$

• The set of integers or whole numbers $\mathbb{Z} = \{\dots, -2, -1, 0, 1, 2, \dots\}$

• The set of rational numbers $\mathbb{Q} = \{\frac{n}{m} : n, m \in \mathbb{Z},\ m \neq 0\}$

• The set of real numbers $\mathbb{R}$; the set of non-negative real numbers is denoted by $\mathbb{R}^+$

• The set of complex numbers $\mathbb{C} = \{a + ib : a, b \in \mathbb{R}\}$

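Example A.2.1 • The set $A$ of all integers between $-1$ and $3$ can be represented in two
ways:

(i) $A = \{-1, 0, 1, 2, 3\}$

(ii) $A = \{n : n \text{ is an integer and } -1 \leq n \leq 3\}$

• $\mathbb{Q} = \{x \in \mathbb{R} : \exists n \in \mathbb{Z}\,(n \neq 0 \wedge nx \in \mathbb{Z})\}$

• $\{\sqrt{2}\} = \{x \in \mathbb{R} : x > 0 \wedge x^2 = 2\}$.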

□ 
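The two ways of representing a set, by listing and by a defining property, correspond closely to set literals and set comprehensions in Python. The sketch below is illustrative only; the search range `range(-10, 11)` is an arbitrary finite window chosen for the example.

```python
# (i) Listing the elements
A_listed = {-1, 0, 1, 2, 3}

# (ii) A defining property, filtered from a finite window of integers
A_built = {n for n in range(-10, 11) if -1 <= n <= 3}

# Both descriptions give the same set, because a set is determined by its elements
same_set = (A_listed == A_built)

# Repetition and order are immaterial: {a} and {a, a} are the same set
repeats_ignored = ({7} == {7, 7})
order_ignored = ({1, 2} == {2, 1})
```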

A set need not have any elements: 



Definition A.2.2 We define the empty set to be the set with no members, and denote it
by the symbol $\emptyset$.



The empty set plays roughly the same role in set theory that the number zero plays in 
ordinary mathematics. 



Figure A.1: Venn diagram illustrating inclusion $A \subseteq B$.

Exercise A.2.3 In the above definition, we speak of "the" empty set. Explain why there is
only one empty set, and not many. To be more concrete, note that both the sets $\{x : x \in
\mathbb{R} \text{ and } x^2 < 0\}$ and $\{x : x \neq x\}$ have no elements. Explain why they are the same set.

□ 

Before we continue, please guard against a common error by noting that

$A \neq \{A\}$

e.g. $\emptyset \neq \{\emptyset\}$.

The set on the left has no elements, whereas the set on the right has one element, namely $\emptyset$.

Definition A.2.4 We say that a set $A$ is a subset of another set $B$, and write

$A \subseteq B$

if and only if every element of $A$ is also an element of $B$.

We say that $A$ is a proper subset of $B$ if $A$ is a subset of $B$, but $A \neq B$.

We may also write $B \supseteq A$ instead of $A \subseteq B$; they mean the same thing (just as $x < y$ and
$y > x$ mean the same thing).

Remarks A.2.5 Note that $A = B$ if and only if $A \subseteq B$ and $B \subseteq A$.

□

Exercises A.2.6 (1) List all the subsets of the set $A := \{0, 1, 2, \{1, 2\}\}$.

(2) Prove that $\emptyset$ is a subset of every set.

[Hint: Give a proof by contradiction. Assume that there is a set $A$ such that $\emptyset \not\subseteq A$.]

(3) Show formally that if $A \subseteq B$ and if $B \subseteq C$, then $A \subseteq C$.

(4) Prove (by induction or otherwise) that if a finite set $A$ has $n$ elements, then it has $2^n$
distinct subsets.

□ 



Figure A.2: Venn diagram illustrating intersection $A \cap B$.

Figure A.3: Venn diagram illustrating union $A \cup B$.

A.2.1 Operations on sets

There are several ways of combining sets to form new sets. In this section we define and 
give some examples of the set-operations union, intersection, difference, complementation, 
cartesian product and power set formation. 

Definition A.2.7 (Union, intersection and difference of two sets)
Suppose that $A$, $B$ are sets.

(a) The union of $A$ and $B$ is the set of all elements which are either in $A$ or in $B$ (or
both).

$A \cup B = \{x : x \in A \vee x \in B\}$

(b) The intersection of $A$ and $B$ is the set of all elements which belong to both $A$ and $B$.

$A \cap B = \{x : x \in A \wedge x \in B\}$

(c) The set difference of $A$ and $B$ is the set of all elements which belong to $A$, but not to
$B$.

$A - B = \{x : x \in A \wedge x \notin B\}$

Two sets $A$, $B$ are said to be disjoint if they have no members in common, i.e. if $A \cap B = \emptyset$.
In that case, $A - B = A$ and $B - A = B$.
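For finite sets these three operations are available directly in Python as the operators `|`, `&` and `-` on the built-in `set` type. The following small sketch, with example sets chosen arbitrarily, mirrors the definition and the remark on disjoint sets.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}

union = A | B          # elements in A or in B (or both)
intersection = A & B   # elements in both A and B
difference = A - B     # elements in A but not in B

# The operators agree with the defining properties
union_by_property = {x for x in range(10) if x in A or x in B}
difference_by_property = {x for x in A if x not in B}

# Disjoint sets: removing one from the other changes nothing
C = {7, 8}
disjoint = (A & C == set())
unchanged = (A - C == A) and (C - A == C)
```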

Often we work within some universe, which is just the set of all objects under consideration
at that time. The sets that we deal with are then typically subsets of the universe.
Which set is the universe depends very much on context. If one is dealing with real numbers, the obvious
choice of universe is $\mathbb{R}$, but if one is dealing with complex numbers as well, then it would be $\mathbb{C}$. If one is
trying to find the solution of an $n$th order differential equation, then the universe will generally be the set of
all $n$-times differentiable functions. In probability theory, the sample space $\Omega$ acts as universe.



Figure A.4: Venn diagram illustrating set difference $A - B$.

Figure A.5: Venn diagram illustrating complementation $A^c$.

Given a universe, we also have a unary operation on sets, called complementation.

Definition A.2.8 Let the universe be $\Omega$, and let $A \subseteq \Omega$. The complement of $A$ is the set
of all elements in the universe which are not in $A$.

$A^c = \{x \in \Omega : x \notin A\}$

Note that $A^c = \Omega - A$. Also note that $A - B = A \cap B^c$.
Here are some standard identities involving the operations: 






Proposition A.2.9 Suppose that $A$, $B$, $C$ are subsets of some universe $\Omega$.

(a) Idempotent laws:
$A \cup A = A$;   $A \cap A = A$

(b) Commutative laws:
$A \cup B = B \cup A$;   $A \cap B = B \cap A$

(c) Associative laws:
$(A \cup B) \cup C = A \cup (B \cup C)$;   $(A \cap B) \cap C = A \cap (B \cap C)$

(d) Distributive laws:
$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$;   $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$

(e) Absorption laws:
$A \cup (A \cap B) = A$;   $A \cap (A \cup B) = A$

(f) Complementation laws:
$A \cup A^c = \Omega$;   $A \cap A^c = \emptyset$;   $(A^c)^c = A$

(g) De Morgan's laws:
$(A \cap B)^c = A^c \cup B^c$;   $(A \cup B)^c = A^c \cap B^c$

Note that each of the identities remains true if

• $\cap$ and $\cup$ are interchanged, and

• $\Omega$ and $\emptyset$ are interchanged.

Proof: We show how to prove one of the above laws, and leave the remainder as exercises.
Let us prove that $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.

First suppose that $x \in A \cap (B \cup C)$. Then $x \in A$ and $x \in B \cup C$, by definition of $\cap$. Thus
$x \in A$ and either (1) $x \in B$, or (2) $x \in C$ (or both), by definition of $\cup$. Thus either (1) $x \in A$
and $x \in B$, or (2) $x \in A$ and $x \in C$. It follows that either (1) $x \in A \cap B$ or (2) $x \in A \cap C$,
and thus that $x \in (A \cap B) \cup (A \cap C)$. We have now shown that if $x \in A \cap (B \cup C)$, then also
$x \in (A \cap B) \cup (A \cap C)$, i.e. that

$A \cap (B \cup C) \subseteq (A \cap B) \cup (A \cap C)$   (*)

Next, assume that $x \in (A \cap B) \cup (A \cap C)$. Then either (1) $x \in A \cap B$, or (2) $x \in A \cap C$.
In either case, it follows that $x \in A$. Also we must have either (1) $x \in B$, or (2) $x \in C$,
and thus $x \in B \cup C$. We see, therefore, that we have both $x \in A$ and $x \in B \cup C$, so that
$x \in A \cap (B \cup C)$. It follows that whenever $x \in (A \cap B) \cup (A \cap C)$, then also $x \in A \cap (B \cup C)$,
i.e. that

$(A \cap B) \cup (A \cap C) \subseteq A \cap (B \cup C)$   (†)

Putting (*) and (†) together, we obtain

$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$

as required.



⊣

Exercise A. 2. 10 Prove the remaining identities in the proposition above. 

(By the way, drawing a Venn diagram does not constitute a proof! Venn diagrams are drawings in the plane, 
and are reliable only when you are dealing with quite a small number of sets.) 

□ 
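Like a Venn diagram, a computer check over one small universe is not a proof, but it is a useful sanity check when working through the identities. The sketch below is an illustration under our own assumptions: the universe `OMEGA` has three elements, and the helper names `subsets`, `comp` and `check_laws` are hypothetical. It tests several of the laws for every triple of subsets.

```python
from itertools import combinations, product

OMEGA = frozenset({0, 1, 2})  # a small universe chosen for illustration

def subsets(s):
    """All subsets of the finite set s, as frozensets."""
    items = list(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def comp(a):
    """Complement relative to the universe OMEGA."""
    return OMEGA - a

def check_laws() -> bool:
    """Check distributive, absorption and De Morgan laws on all triples of subsets."""
    for A, B, C in product(subsets(OMEGA), repeat=3):
        if A & (B | C) != (A & B) | (A & C):
            return False
        if A | (B & C) != (A | B) & (A | C):
            return False
        if A | (A & B) != A or A & (A | B) != A:
            return False
        if comp(A & B) != comp(A) | comp(B):
            return False
        if comp(A | B) != comp(A) & comp(B):
            return False
    return True

laws_hold = check_laws()
```

The check runs over 8 subsets and hence 512 triples. It says nothing about infinite sets, for which the element-chasing argument of the proof is still needed.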

A set is completely determined by its elements. The order in which those elements are
arranged does not matter. For example, $\{a, b\} = \{b, a\}$. When we want the order to matter,
we have to deal with ordered tuples. An ordered pair is denoted by $(a, b)$, and should be
thought of as a collection containing $a$ and $b$, in that order. Thus $(a, b) \neq (b, a)$ unless $a = b$. Note that

$(a, b) = (c, d) \iff a = c$ and $b = d$

Generally, an ordered $n$-tuple is denoted by $(a_1, a_2, \dots, a_n)$, and should be thought of as a
collection containing $a_1, a_2, \dots, a_n$, in that order.

The pair $(a, b)$ is usually defined to be the set $\{\{a\}, \{a, b\}\}$. You can check that this definition yields the
required property that $(a, b) = (c, d)$ iff $a = c$ and $b = d$.

$(a, b, c)$ is then defined to be $(a, (b, c))$ (which is just the set $\{\{a\}, \{a, \{\{b\}, \{b, c\}\}\}\}$), etc. This is in keeping
with the notion that all mathematical objects should be sets. On first encounter, however, you might find this
arbitrary, clumsy, and unnecessary, and you wouldn't be far wrong: The main thing that you need to keep in
mind is that an ordered tuple is a collection in which the order matters.
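The set-theoretic definition of an ordered pair can be tried out with Python's immutable `frozenset`. This is only a sketch: the function name `pair` and the small test domain are our own choices, and checking a finite domain illustrates the characteristic property rather than proving it.

```python
from itertools import product

def pair(a, b):
    """The pair (a, b) encoded as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def characteristic_property_holds(domain) -> bool:
    """Check that pair(a, b) == pair(c, d) exactly when a == c and b == d."""
    return all((pair(a, b) == pair(c, d)) == (a == c and b == d)
               for a, b, c, d in product(domain, repeat=4))

order_matters = pair(1, 2) != pair(2, 1)       # unlike the sets {1, 2} and {2, 1}
property_ok = characteristic_property_holds(range(4))
```

Note that `pair(1, 1)` collapses to the one-element set containing `{1}`, and the property still holds in that degenerate case.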

Using ordered tuples, we can define one more way of making new sets from old: 



Definition A.2.11 (Cartesian product) Suppose that $A_1, A_2, \dots, A_n$ are sets. The cartesian
product of $A_1, \dots, A_n$ is the set of all $n$-tuples $(a_1, \dots, a_n)$, with each $a_k \in A_k$.

$A_1 \times A_2 \times \cdots \times A_n = \{(a_1, a_2, \dots, a_n) : a_k \in A_k \text{ for } k = 1, 2, \dots, n\}$

We will identify the sets $(A \times B) \times C$ and $A \times (B \times C)$ with $A \times B \times C$, although, strictly
speaking, they are not equal.

For example, $((a, b), c)$ is an element of the first set, but not of the second or third. $(a, (b, c))$ belongs to the
second, but not to the first or third. $(a, b, c)$ belongs to the third, but not to the first two. However, we shall
simply identify $(a, (b, c))$, $((a, b), c)$ and $(a, b, c)$, i.e. we shall not distinguish between them. After all, all that
matters is the order of $a$, $b$, $c$ and that is the same in each of these tuples.
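For finite sets, `itertools.product` from the Python standard library enumerates a cartesian product, and nested tuples make the distinction between the three bracketings visible. The sets below are arbitrary examples chosen for illustration.

```python
from itertools import product

A = {1, 2}
B = {"x", "y"}
C = {"u"}

# A x B x C as a set of flat triples
AxBxC = set(product(A, B, C))

# (A x B) x C and A x (B x C) as sets of nested pairs
left = {((a, b), c) for (a, b) in product(A, B) for c in C}
right = {(a, (b, c)) for a in A for (b, c) in product(B, C)}

# Strictly speaking the three sets are different ...
all_distinct = (left != AxBxC) and (right != AxBxC) and (left != right)

# ... but flattening the brackets identifies them
flat_left = {(a, b, c) for ((a, b), c) in left}
flat_right = {(a, b, c) for (a, (b, c)) in right}
identified = (flat_left == AxBxC) and (flat_right == AxBxC)
```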

Thus far, we have considered union, intersection and cartesian product as binary opera- 
tions, involving just two sets. Frequently, however, we may need to consider these as infinitary 
operations: We can, for example, take the union of infinitely many sets. We define the union, 
intersection and cartesian product of a family of sets as follows: 






Definition A.2.12 (Union, intersection and product of a family of sets)
If $\mathcal{A} = \{A_i : i \in I\}$ is a family of sets, we may define

(a) the union

$\bigcup \mathcal{A} = \bigcup_{i \in I} A_i = \{x : x \in A_i \text{ for some } i \in I\}$

(b) the intersection

$\bigcap \mathcal{A} = \bigcap_{i \in I} A_i = \{x : x \in A_i \text{ for all } i \in I\}$

(c) the cartesian product

$\prod \mathcal{A} = \prod_{i \in I} A_i = \{(a_i)_I : a_i \in A_i \text{ for all } i \in I\}$

Here $(a_i)_I$ is a generalized tuple, indexed by $I$.

In essence, $(a_i)_I$ is a function with domain $I$ taking values in $\bigcup_{i \in I} A_i$. We will return to this later.

We will frequently write $\bigcup_i A_i$ or $\bigcup_I A_i$ instead of $\bigcup_{i \in I} A_i$. We will also write $\bigcup_{n=1}^{\infty} A_n$ instead
of $\bigcup_{n \in \mathbb{N}} A_n$. The same holds for $\bigcap$ and $\prod$.



Remarks A.2.13 Note that

(i) $\bigcup \{A, B\} = A \cup B$

(ii) $\prod \{A, B, C\} = A \times B \times C$

(iii) $\bigcap \{X_1, X_2, \dots, X_n\} = X_1 \cap X_2 \cap \cdots \cap X_n$

etc.

□ 
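For a finite indexed family, the union, intersection and product above can be computed directly. The sketch below is an illustration under our own assumptions: the family is stored as a Python dictionary from indices to sets, and a generalized tuple is represented as a dictionary from each index to the chosen element.

```python
from itertools import product

# An indexed family A_i = {0, 1, ..., i} for i in I = {1, 2, 3}
A = {i: set(range(i + 1)) for i in (1, 2, 3)}

# x is in the union iff x is in A_i for some i
big_union = set().union(*A.values())

# x is in the intersection iff x is in A_i for all i (the family must be non-empty)
big_intersection = set.intersection(*A.values())

# Generalized tuples: one choice a_i in A_i for each index i
generalized_tuples = [dict(zip(A.keys(), choice))
                      for choice in product(*A.values())]

every_choice_valid = all(t[i] in A[i] for t in generalized_tuples for i in A)
```

Here the sets have 2, 3 and 4 elements, so the product has 24 generalized tuples.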

Exercise A.2.14 Let $A_1, A_2, A_3, \dots, A_n, \dots$ be a sequence of subsets of a fixed set $\Omega$. For
$x \in \Omega$, we say that

$x \in A_n$ eventually (ev.)

if $x$ belongs to all the $A_n$ from some point onwards, i.e. if there exists an $N \in \mathbb{N}$ such that
$x \in A_n$ for all $n \geq N$. (Then $x$ belongs to all the $A_n$ from $A_N$ onwards.) Let $(A_n, \text{ev.})$ denote
the set of all $x$ such that $x$ belongs to $A_n$ eventually, i.e.

$(A_n, \text{ev.}) := \{x \in \Omega : x \in A_n, \text{ev.}\}$

Similarly, we say that

$x \in A_n$ infinitely often (i.o.)

if $x$ belongs to infinitely many of the sets $A_n$, or, more accurately, if there are infinitely many
$n \in \mathbb{N}$ such that $x \in A_n$. Let $(A_n, \text{i.o.})$ denote the set of all $x$ such that $x$ belongs to $A_n$
infinitely often, i.e.

$(A_n, \text{i.o.}) := \{x \in \Omega : x \in A_n, \text{i.o.}\}$

(a) Explain why $(A_n, \text{ev.}) \subseteq (A_n, \text{i.o.})$.

(b) Explain why the following is true:

$x \in A_n, \text{ev.} \iff \exists N \in \mathbb{N}\ \forall n \geq N\,(x \in A_n)$

(c) Explain why we may express $(A_n, \text{ev.})$ as follows:

$(A_n, \text{ev.}) = \bigcup_{N \in \mathbb{N}} \bigcap_{n \geq N} A_n$

[Hint: Try to understand the following reasoning: For $N \in \mathbb{N}$, define $B_N := \bigcap_{n \geq N} A_n =
A_N \cap A_{N+1} \cap A_{N+2} \cap \cdots$. Then $x \in \bigcup_{N \in \mathbb{N}} \bigcap_{n \geq N} A_n$ iff $x \in \bigcup_{N \in \mathbb{N}} B_N$ iff there is some
$N$ such that $x \in B_N$. (Why?) Now $x \in B_N$ iff $x \in A_n$ for each $n \geq N$. (Why?)]

(d) Explain why the following is true:

$x \in A_n, \text{i.o.} \iff \forall N \in \mathbb{N}\ \exists n \geq N\,(x \in A_n)$

[Hint: If $x \in A_n$ i.o., then for each $N$, there must be $n \geq N$ such that $x \in A_n$. For if
this were not so, then there would be some $N$ such that $x \notin A_n$ for any $n \geq N$. But then
$x$ can belong only to those $A_n$ for $n \in \{1, 2, \dots, N - 1\}$, i.e. to only finitely many of the
$A_n$.]

(e) Explain why we may express $(A_n, \text{i.o.})$ as follows:

$(A_n, \text{i.o.}) = \bigcap_{N \in \mathbb{N}} \bigcup_{n \geq N} A_n$

(f) Explain why

$(A_n, \text{ev.})^c = (A_n^c, \text{i.o.})$ and $(A_n, \text{i.o.})^c = (A_n^c, \text{ev.})$

Do this in two ways: via logic, and via set theory.

□ 

Here is another way of making new sets from old: Given a particular set, one should be 
able to collect all of its subsets together into a new set, called the power set. 

Definition A.2.15 (Power set)

If $A$ is a set, then the power set of $A$ is the set of all subsets of $A$.

$\mathcal{P}(A) = \{B : B \subseteq A\}$

Note that $\emptyset, A \in \mathcal{P}(A)$. They are, respectively, its smallest and biggest members.
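For a finite set the power set can be listed explicitly. The following sketch uses `itertools.combinations` to collect the subsets of each size; the function name `power_set` is our own, and subsets are stored as `frozenset` so that they can themselves be members of a set.

```python
from itertools import combinations

def power_set(a):
    """The set of all subsets of the finite set a."""
    items = list(a)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}

A = {1, 2, 3}
P = power_set(A)

has_empty_set = frozenset() in P          # the smallest member
has_whole_set = frozenset(A) in P         # the biggest member
all_are_subsets = all(B <= A for B in P)  # every member is a subset of A
```

A set with three elements has eight subsets, in line with the count of $2^n$ subsets for a set with $n$ elements.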



A.2.2 Functions

Originally, a function was regarded as a rule (or a formula, or an algorithm) for associating
one real number with another. For example,

$f(x) = 2x^3$

explicitly shows how to calculate a number $f(x)$ which is to be associated with $x$: First cube
$x$, and then multiply the result by 2. However, this original formulation proved to be
unduly restrictive. For one thing, Fourier showed that practically any continuous curve of
finite length could be given a "formula" as an infinite trigonometric series. For another, we
may want to associate numbers with other mathematical objects, or one kind of mathematical
object with another: there is no reason to restrict ourselves solely to numbers.

For example, we may want to associate with each rectangle its area. Thus we have a function which assigns a 
number to each rectangle. 

Or, we may want to assign to each subset of R its power set. This yields a function which assigns a set to each 
set. 

Thus a general definition of function dispenses with the idea that it is a rule, but keeps 
the idea of associating one object with another: 

Definition A.2.16 Let $A$, $B$ be sets. A function (or map) $f$ from $A$ to $B$, written

$f : A \to B$ or $A \xrightarrow{f} B$

is a subset of the cartesian product $A \times B$ with the following property:

for each $a \in A$ there exists exactly one $b \in B$ such that $(a, b) \in f$

In that case write

$f(a) = b$ instead of $(a, b) \in f$

We call $b$ the image (or value) of $a$ under $f$, and call $a$ a preimage of $b$. We also say that
$a$ maps to $b$ under $f$.

The set $A$ is called the domain of $f$, and the set $B$ is called the codomain of $f$:

$A = \mathrm{dom}(f)$,   $B = \mathrm{codom}(f)$

The range of $f$ is the set of all possible values of $f$, and is denoted $\mathrm{ran}(f)$.

Essentially, this concept of function is arrived at by deliberately confusing a function with
its graph. For example, the graph of the function $f : \mathbb{R} \to \mathbb{R} : x \mapsto 2x^3$ is a curve in the
cartesian plane. This curve is therefore a set of ordered pairs:

$\mathrm{Graph}(f) = \{(x, y) : y = 2x^3\}$

For example, the points $(0, 0)$, $(1, 2)$, $(2, 16)$, $(3, 54)$ belong to the graph. Now we assert that
a function is its graph. Thus the function $f(x) = 2x^3$ is nothing but the set $\{(x, y) : y =
2x^3\} \subseteq \mathbb{R} \times \mathbb{R}$.
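The definition of a function as a set of ordered pairs can be tested on a finite example. In the sketch below, the finite domain and codomain are assumptions made so that everything can be enumerated, and the helper names `is_function` and `apply` are hypothetical.

```python
def is_function(pairs, domain, codomain) -> bool:
    """pairs is a subset of domain x codomain with exactly one b for each a."""
    if any(a not in domain or b not in codomain for (a, b) in pairs):
        return False
    return all(sum(1 for (x, _) in pairs if x == a) == 1 for a in domain)

def apply(pairs, a):
    """Look up the value f(a), i.e. the unique b with (a, b) in pairs."""
    return next(b for (x, b) in pairs if x == a)

A = {0, 1, 2, 3}
B = set(range(55))

# The graph of x -> 2x^3 restricted to A
graph = {(x, 2 * x ** 3) for x in A}

graph_is_function = is_function(graph, A, B)
two_values = is_function(graph | {(1, 5)}, A, B)      # 1 would have two images
missing_value = is_function(graph - {(3, 54)}, A, B)  # 3 would have no image
```

The last two sets of pairs fail the test for opposite reasons: one element of the domain has too many images, the other has none.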



Examples A.2.17 You've already met more than just a few functions in your mathematical
education up to date. The most obvious ones are functions from $\mathbb{R}^n$ to $\mathbb{R}^m$, such as $f(x) =
x^2$, $g(x, y) = \sin(x^2 + y)$, $h(x, y, z) = (xy, x \ln z)$ etc. Here are a few more that you might not
yet have considered as functions:

(a) Define $f : \mathbb{Z} \to \mathcal{P}(\mathbb{Z})$ by: $f(n) = \{m : m \text{ divides } n\}$. Then $f$ is a function which maps a
number to a set. For example,

$f(12) = \{\pm 1, \pm 2, \pm 3, \pm 4, \pm 6, \pm 12\} = f(-12)$

(b) Let $C^0(\mathbb{R}, \mathbb{R}) = \{f : f \text{ is a continuous map from } \mathbb{R} \text{ to } \mathbb{R}\}$, and let $a < b \in \mathbb{R}$. Then
$\int_a^b : C^0(\mathbb{R}, \mathbb{R}) \to \mathbb{R}$ is a function which assigns to every continuous map its definite
integral.

(c) Let $C^1(\mathbb{R}, \mathbb{R})$ be the set of all maps from $\mathbb{R}$ to $\mathbb{R}$ which have continuous first derivatives.
Then the derivative operator is a map $D : C^1(\mathbb{R}, \mathbb{R}) \to C^0(\mathbb{R}, \mathbb{R})$.

(d) curl is a map from the set of vector fields on $\mathbb{R}^3$ to itself. div is a map from the set of
vector fields on $\mathbb{R}^3$ to the set of functions $\mathbb{R}^3 \to \mathbb{R}$. grad is a map from the set of
differentiable functions $\mathbb{R}^3 \to \mathbb{R}$ to the set of vector fields on $\mathbb{R}^3$.

(e) An $n \times m$ matrix $A$ can be regarded as a map $A : \mathbb{R}^m \to \mathbb{R}^n$.

(f) Addition and multiplication are functions from $\mathbb{R}^2$ to $\mathbb{R}$. Addition can, in fact, be
described by the $1 \times 2$-matrix $(1\ 1)$, for $(1\ 1)\binom{a}{b} = a + b$.

(g) If $\Omega$ is a universal set, then union and intersection can be regarded as functions from
$\mathcal{P}(\Omega) \times \mathcal{P}(\Omega)$ to $\mathcal{P}(\Omega)$ which map the ordered pair $(A, B)$ to $A \cup B$ and $A \cap B$ respectively.

(h) We can also regard the bigger version $\bigcup$ of union as a map, but this time we have
$\bigcup : \mathcal{P}(\mathcal{P}(\Omega)) \to \mathcal{P}(\Omega)$. It assigns to any family of subsets of $\Omega$ its union. (Note that a
family of subsets of $\Omega$ is just a set of elements of $\mathcal{P}(\Omega)$, i.e. it is a subset of $\mathcal{P}(\Omega)$, and
therefore an element of $\mathcal{P}(\mathcal{P}(\Omega))$.) The same goes for intersection.

□ 

For any set $A$, there is an important function on $A$ called the identity function. It is
denoted by $\mathrm{id}_A$, and is defined by

$\mathrm{id}_A : A \to A$,   $\mathrm{id}_A(a) = a$

Thus $\mathrm{id}_A = \{(a, a) : a \in A\}$.

Examples A.2.18 (a) The identity function on $\mathbb{R}$ is just the function $y = x$.
(b) The identity function on $\mathbb{R}^n$ is the identity matrix

$$
I_n = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}
$$

□ 

Definition A.2.19 Let $f : A \to B$. If $A' \subseteq A$, we can define the restriction of $f$ to $A'$ as
follows:

$f|_{A'}$ is a map from $A'$ to $B$, such that $(f|_{A'})(a) = f(a)$ for all $a \in A'$

Definition A.2.20 Let $f : A \to B$ be a function.

(a) $f$ is said to be one-to-one (or 1-1, or injective) if and only if the following condition
holds:

If $f(a_1) = f(a_2)$, then $a_1 = a_2$.

(b) $f$ is said to be onto (or surjective) if and only if

For every $b \in B$ there exists an $a \in A$ such that $f(a) = b$.

(c) $f$ is said to be a bijection (or a one-to-one correspondence) if it is both an injection
and a surjection.

Remarks A.2.21 A function $f : A \to B$ is injective if no two distinct members of $A$ map to
the same $b \in B$, i.e. if every $b \in B$ has at most one preimage.

$f$ is surjective if and only if every $b \in B$ gets mapped onto by some $a \in A$, i.e. if every $b \in B$
has at least one preimage. In that case $B$ is the range of $f$, i.e. $\mathrm{ran}(f) = \mathrm{codom}(f)$.
$f$ is a bijection if and only if every $b \in B$ has exactly one preimage.

It should be clear that there is a bijection from a finite set $A$ to another set $B$ if and only
if $A$ and $B$ have the same number of elements.

□ 
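For functions between finite sets, the three properties can be checked by counting preimages. In the sketch below a function is stored as a Python dictionary from elements of the domain to their images, which is an assumption of this illustration; the helper names are our own.

```python
def is_injective(f: dict) -> bool:
    """No two distinct arguments share a value: every b has at most one preimage."""
    return len(set(f.values())) == len(f)

def is_surjective(f: dict, codomain) -> bool:
    """Every b in the codomain has at least one preimage."""
    return set(f.values()) == set(codomain)

def is_bijective(f: dict, codomain) -> bool:
    return is_injective(f) and is_surjective(f, codomain)

# Squaring on {-2, ..., 2} with codomain {0, ..., 4}
square = {x: x * x for x in range(-2, 3)}
codomain = range(5)

square_injective = is_injective(square)               # fails: (-1)^2 == 1^2
square_surjective = is_surjective(square, codomain)   # fails: 2 and 3 are not squares

# Restricting the domain and shrinking the codomain to the range gives a bijection
restricted = {x: x * x for x in range(3)}
restricted_bijective = is_bijective(restricted, {0, 1, 4})
```

Whether a function is surjective depends on the stated codomain, not just on the rule, which is why the codomain is passed in explicitly.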

Examples A.2.22 (a) Let $f(x) = x^2$. We would generally regard $f$ as a function with
domain $\mathbb{R}$ and codomain $\mathbb{R}$. The range of $f$ is $[0, +\infty)$, since $f$ takes no negative values.
$f$ is not injective, because, for example, $f(1) = f(-1)$. $f$ is not surjective either, since $-1$
is not in the range of $f$.

(b) If we define $g : [0, 1] \to [0, 1]$ by $g(x) = x^2$, then we may regard $g$ as the restriction
of $f$ to $[0, 1]$, i.e. $g = f|_{[0, 1]}$. Now $g$ is clearly a bijection.

(c) $x^3 : \mathbb{R} \to \mathbb{R}$ is a bijection.

(d) Let Q+ denote the set of all non-negative rational numbers. The map /i : Z X N — > Q"*" 
defined by h{n,m) = ^ is surjective, but not injective. 

(e) If $A \subseteq B$, then the inclusion $f : A \to B$ defined by: $f(a) = a$ is an injection. It is a
bijection if and only if $A = B$.

(f) Let $A$ be an $n \times n$-matrix, regarded as a map from $\mathbb{R}^n$ to $\mathbb{R}^n$. Then $A$ is injective if and
only if $\det(A) \neq 0$.

□ 

Next, we discuss how functions can be combined: 



136 



Sets, Functions and Relations 



Definition A. 2. 23 If f : A ^ B and g : B ^ C, then ^ o / is a function from A to C, 
defined by 

{9of){a)=g{f{a)) 



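Note that the composition $g \circ f$ does in one step what $f$ and $g$ do in two:

$A \xrightarrow{f} B \xrightarrow{g} C$, $a \mapsto f(a) \mapsto g(f(a))$
$A \xrightarrow{g \circ f} C$, $a \mapsto g(f(a))$

Also note that $g \circ f$ means:

Do $f$ first, then $g$

i.e. the last shall be first.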

An often used fact is that composition is an associative operation on functions, i.e.

$h \circ (g \circ f) = (h \circ g) \circ f$

By this equation we mean that one side is defined if and only if the other side is defined,
and in that case they are equal.

For if $f : A \to B$, $g : B \to C$ and $h : C \to D$, then $h \circ (g \circ f)$ is a function from $A$ to $D$ which
works as follows: First do $g \circ f$, then do $h$. But to do $g \circ f$, you must first do $f$, then $g$. The
combined result is

First do $f$, then $g$, and then $h$: $(h \circ (g \circ f))(a) = h(g(f(a)))$

Similarly, $(h \circ g) \circ f$ is a function from $A$ to $D$ which works as follows: First do $f$, then $h \circ g$.
But to do $h \circ g$, you must first do $g$, then $h$. The combined result is therefore

First do $f$, then $g$, and then $h$: $((h \circ g) \circ f)(a) = h(g(f(a)))$

and thus $h \circ (g \circ f) = (h \circ g) \circ f$, as claimed.

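Example A.2.24 Consider the following functions (note their domains and codomains):

$f : \mathbb{R} \to \mathbb{R}^+ : x \mapsto x^2 + 1$
$g : \mathbb{R}^+ \to \mathbb{R}^+ : y \mapsto \sqrt{y}$
$h : \mathbb{R}^+ \to [-1, 1] : z \mapsto \sin(z)$

Then

$g \circ f : \mathbb{R} \to \mathbb{R}^+ : x \mapsto \sqrt{x^2 + 1}$
$h \circ g : \mathbb{R}^+ \to [-1, 1] : y \mapsto \sin(\sqrt{y})$

and thus

$h \circ (g \circ f) : \mathbb{R} \to [-1, 1] : x \mapsto \sin(\sqrt{x^2 + 1})$
$(h \circ g) \circ f : \mathbb{R} \to [-1, 1] : x \mapsto \sin(\sqrt{x^2 + 1})$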

□ 
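Associativity of composition can also be observed numerically. The sketch below assumes that the three functions of the example are $x \mapsto x^2 + 1$, the square root and the sine; the helper `compose` and the sample points are illustrative choices, not part of the text.

```python
import math


def compose(g, f):
    """Return g o f, i.e. the function a -> g(f(a)): do f first, then g."""
    return lambda a: g(f(a))


f = lambda x: x * x + 1   # R -> R+
g = math.sqrt             # R+ -> R+
h = math.sin              # R+ -> [-1, 1]

left = compose(h, compose(g, f))    # h o (g o f)
right = compose(compose(h, g), f)   # (h o g) o f

# Both bracketings perform the same three steps in the same order.
for x in (-2.0, 0.0, 0.5, 3.0):
    assert left(x) == right(x) == math.sin(math.sqrt(x * x + 1))
```

Note that `compose(g, f)` takes its arguments in the same order as the notation $g \circ f$, so the function applied first is written last.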






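Exercises A.2.25 (1) Let $f : \mathbb{N} \to \mathbb{N} : m \mapsto \ldots$ and let $g : \mathbb{N} \to \mathbb{N} : n \mapsto n + 2$. Calculate
$(f \circ g)(5)$ and $g \circ f(5)$.
Write down formulas for $f \circ g$ and $g \circ f$.

(2) Suppose that $f(x) = \ldots$ and $g(x) = x + 3$. Calculate $g \circ f(x)$ and $f \circ g(x)$. Note that
$g \circ f \neq f \circ g$.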

(3) If $A$ is an $n \times m$-matrix, and $B$ is an $m \times r$-matrix, then we can regard them as functions
$A : \mathbb{R}^m \to \mathbb{R}^n$ and $B : \mathbb{R}^r \to \mathbb{R}^m$. The composition $A \circ B$ is therefore a map $\mathbb{R}^r \to \mathbb{R}^n$. It is not hard
to show that the composition is just the matrix product, i.e. that $A \circ B = AB$. Do so!

(4) Suppose that $g \circ f_1 = g \circ f_2$. Prove that if $g$ is injective then we can "cancel" $g$ to conclude
$f_1 = f_2$. Give an example to show that left-cancellation may fail if $g$ is not injective.

(5) Suppose that $g_1 \circ f = g_2 \circ f$. Prove that if $f$ is surjective then we can "cancel" $f$ to obtain
$g_1 = g_2$. Show that right-cancellation may fail if $f$ is not surjective.

□ 

Note that if $f : A \to B$, then $f \circ \mathrm{id}_A = f$, and $\mathrm{id}_B \circ f = f$. Thus the identity function
behaves like an identity element for the operation of composition.

The number 0 is an identity element for the operation of addition, because $x + 0 = x$.
The number 1 is an identity element for the operation of multiplication, because $x \cdot 1 = x$.

Next, we tackle the idea of inverting (or reversing) the effect of a function. Take the
function $f(x) = 3x$. It transforms the number $x$ into the number $3x$. To undo this transformation,
you just multiply $3x$ by $\frac{1}{3}$. The function $g(x) = \frac{1}{3}x$ inverts the effect of $f$, in that

$g \circ f(x) = x$, $f \circ g(y) = y$

Thus applying first $f$, and then $g$ gets you back to the starting point $x$. The same holds true
if you apply $g$ first, and then $f$.

Can every function be inverted? No, as is easy to see: Consider the function $f(x) = x^2$.
Then $f(2) = 4 = f(-2)$. Now if $g$ is a function which reverses the effect of $f$, then we cannot
decide whether $g(4) = 2$ or $g(4) = -2$. The problem arises because $f$ is not 1-1.

Let's make the preceding discussion precise: 

Definition A.2.26 Let $f : A \to B$. We say that $f$ is invertible if and only if there is a
function $g : B \to A$ such that

$g(f(a)) = a$ for all $a \in A$, $f(g(b)) = b$ for all $b \in B$ (*)

The function $g$, if it exists, is called the inverse of $f$, and denoted $g = f^{-1}$. Then (*)
amounts to saying

$f^{-1} \circ f = \mathrm{id}_A$ and $f \circ f^{-1} = \mathrm{id}_B$

Note that if $f^{-1}$ exists, then

$f^{-1}(b) = a$ if and only if $f(a) = b$

Proposition A.2.27 A function $f : A \to B$ is invertible if and only if it is a bijection.






Proof: Suppose that $f$ is invertible, i.e. that $f^{-1}$ exists. Then $f^{-1}$ is a function from $B$ to
$A$. We first show that $f$ is surjective: Let $b \in B$. Since the domain of $f^{-1}$ is $B$, $f^{-1}(b)$ must be
defined, i.e. there must be some $a \in A$ such that $f^{-1}(b) = a$. But then $f(a) = b$. Hence every
$b \in B$ has a preimage.

Next we show that $f$ is injective. For suppose that $f(a_1) = f(a_2) = b$. Then $f^{-1}(b) = a_1$ and
$f^{-1}(b) = a_2$. Since $f^{-1}$ is a function, we must have $a_1 = a_2$ (check the definition of function),
and hence $f$ is injective.

This proves that if $f$ is invertible, then $f$ is a bijection.

Now we prove the converse. If $f$ is a bijection, then it is onto $B$. Hence for every $b \in B$
there is some $a \in A$ such that $f(a) = b$. Moreover, since $f$ is one-to-one, that $a$ has to be
unique. So we may define $f^{-1}(b)$ to be the unique $a$ such that $f(a) = b$. This makes $f^{-1}$ into
a well-defined function $f^{-1} : B \to A$. By construction, $f(f^{-1}(b)) = b$ for all $b \in B$ and
$f^{-1}(f(a)) = a$ for all $a \in A$, so (*) holds.

⊣
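For a function between finite sets, the construction in the second half of the proof can be carried out directly. The sketch below assumes that $f$ is stored as a dictionary; the helper name `inverse` is hypothetical. It builds $f^{-1}$ by reversing each pair $(a, f(a))$, and it fails exactly when $f$ is not a bijection.

```python
def inverse(f: dict, codomain: set) -> dict:
    """Return f^{-1} as a dict, or raise ValueError if f is not a bijection."""
    inv = {}
    for a, b in f.items():
        if b in inv:  # b already has a preimage, so f is not injective
            raise ValueError(f"not injective: {inv[b]!r} and {a!r} both map to {b!r}")
        inv[b] = a
    if set(inv) != set(codomain):  # some b has no preimage, so f is not surjective
        raise ValueError("not surjective")
    return inv


f = {"a": 1, "b": 2, "c": 3}
f_inv = inverse(f, {1, 2, 3})

assert all(f_inv[f[a]] == a for a in f)        # f^{-1} o f = id_A
assert all(f[f_inv[b]] == b for b in f_inv)    # f o f^{-1} = id_B
```

The two error cases correspond to the two halves of the first part of the proof: a value with two preimages violates injectivity, and a value with none violates surjectivity.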

Examples A.2.28 (a) The function $f(x) = x^3$ is a bijection on the reals, and its inverse is
$g(x) = \sqrt[3]{x}$.

(b) The function $f(x) = x^2$ does not have an inverse, since it is not a bijection. However, if
we restrict $f$ to the non-negative reals, then $f|\mathbb{R}^+$ is a bijection from $\mathbb{R}^+$ onto $\mathbb{R}^+$. Its inverse is the square
root function.

(c) The function $f : \mathbb{R} \to (0, +\infty)$ defined by $f(x) = e^x$ is bijective. Its inverse is the
natural logarithm $\ln x$.

(d) The function $\sin x$ is neither injective nor surjective; however, if we restrict $\sin x$ and
regard it as a function $[-\frac{\pi}{2}, \frac{\pi}{2}] \to [-1, 1]$, then it is a bijection, and its inverse is $\arcsin x$.

(e) If $A$ is an $n \times n$-matrix, regarded as a function on $\mathbb{R}^n$, then $A$ has an inverse function if
and only if $A$ has an inverse matrix. Since composition is just matrix multiplication, the
inverse function of $A$ is just the inverse matrix $A^{-1}$.

□ 

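Remarks A.2.29 Note that, in general, $f^{-1}(x) \neq (f(x))^{-1}$,
e.g. for $f(x) = 3x$ we have $f^{-1}(x) = \frac{1}{3}x$, whereas $(f(x))^{-1} = \frac{1}{3x}$.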

The number $x^{-1} = \frac{1}{x}$ is the inverse of $x$ under the operation of multiplication, in that

$x \cdot x^{-1} = 1$, $x^{-1} \cdot x = 1$

noting that 1 is the identity for multiplication.

The function $f^{-1}$ is the inverse of $f$ under the operation of composition, in that

$f \circ f^{-1} = \mathrm{id}$, $f^{-1} \circ f = \mathrm{id}$

noting that id is the identity for composition.

The same notation for inverse, i.e. $^{-1}$, refers to different operations, so there's no reason to believe that there
is any relationship between them.



□ 






The notion of invertibility can be refined:

Definition A.2.30 Let $f : A \to B$ and $g : B \to A$.

(a) $g$ is called a left inverse of $f$ if $g \circ f = \mathrm{id}_A$.

(b) $g$ is called a right inverse of $f$ if $f \circ g = \mathrm{id}_B$.

Note that if $f$ is invertible, then $f^{-1}$ is both a left and a right inverse of $f$. Conversely, if $g$ is
both a left and a right inverse of $f$, then $f$ is invertible and $g = f^{-1}$.

Exercises A.2.31 (1) Prove that a function $f$ has a left inverse if and only if it is injective.

(2) Prove that a function $f$ has a right inverse if and only if it is surjective.

(3) Prove that if a function $f$ has a left inverse $g$ and a right inverse $h$, then $f$ is invertible,
and $g = h$.

(4) Consider $f : \{a, b, c\} \to \{1, 2\}$ defined by $f(a) = f(b) = 1$, $f(c) = 2$. Find two distinct
right inverses of $f$.

(5) Consider the inclusion $\iota : \mathbb{Z} \to \mathbb{Q}$. Construct two distinct left inverses of $\iota$.

□ 

A.2.3 Functions Operating On Sets

We have already noted the confusion that may possibly arise from the two uses of the symbol
$^{-1}$. We have but few symbols at our disposal, and many of them must therefore serve more
than one function. Thus you must always be aware of the context in which a particular symbol
is used.

You have to do this when using ordinary language: You know in what sense the newspaper headline 

"School kids make great snacks at fund raiser" 
is meant, even though the other sense offers greater amusement value. 

I say this because we are about to add to the possible confusion. With every function
$f : A \to B$ (not necessarily invertible), we can associate two new functions between the power
sets of $A$ and $B$:

$f[\cdot] : \mathcal{P}(A) \to \mathcal{P}(B) : A' \mapsto \{b \in B : \text{there is } a' \in A' \text{ such that } f(a') = b\}$ where $A' \subseteq A$
$f^{-1}[\cdot] : \mathcal{P}(B) \to \mathcal{P}(A) : B' \mapsto \{a \in A : f(a) \in B'\}$ where $B' \subseteq B$

Thus $f[\cdot]$ assigns to each subset $A'$ of $A$ a subset $f[A'] \subseteq B$. Similarly, $f^{-1}[\cdot]$ transforms each
subset $B'$ of $B$ into a subset $f^{-1}[B'] \subseteq A$.

We will, for the moment, use square brackets to distinguish the various functions, but will
drop this convention later. Which function is meant will be clear from context. We shall also
call $f[A']$ the direct image of $A'$ along $f$, and $f^{-1}[B']$ the inverse image of $B'$ along $f$. Note
that

$f[A']$ = set of all images of $a \in A'$

whereas

$f^{-1}[B']$ = set of all preimages of $b \in B'$






Remarks A.2.32 Sometimes the notation $f^{\to}$ is used for direct image, and $f^{\leftarrow}$ for inverse image.

□ 

Inverse images play a very important role in mathematics. It is therefore useful to
remember the following:

$a \in f^{-1}[B']$ if and only if $f(a) \in B'$

Similarly,

$b \in f[A']$ if and only if there is $a' \in A'$ such that $f(a') = b$
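Examples A.2.33 (a) Suppose that $f : \mathbb{R} \to \mathbb{R} : x \mapsto x^2$. Then

$f[-1, 2] = [0, 4]$, $f[\mathbb{Z}] = \{0, 1, 4, 9, \dots\}$, $f[\{4\}] = \{16\}$

Also

$f^{-1}[0, 1] = [-1, 1]$, $f^{-1}[\{4\}] = \{2, -2\}$, $f^{-1}[\{-4\}] = \emptyset$

In each case a set is transformed into a set.

(b) Suppose that $A = \{a_1, a_2, a_3\}$, $B = \{b_1, b_2, b_3\}$, and that $f : A \to B$ is defined by
$f(a_1) = f(a_3) = b_1$, and $f(a_2) = b_3$. Then

$f[\{a_1\}] = f[\{a_3\}] = f[\{a_1, a_3\}] = \{b_1\}$, $f[\{a_2\}] = \{b_3\}$, $f[A] = \{b_1, b_3\}$, $f[\emptyset] = \emptyset$

and

$f^{-1}[\{b_3\}] = \{a_2\}$, $f^{-1}[\{b_2\}] = f^{-1}[\emptyset] = \emptyset$, $f^{-1}[B] = f^{-1}[\{b_1, b_3\}] = A$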

□ 
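Direct and inverse images of finite sets are easy to compute, which gives a quick way to check calculations like those in Example (b). The following sketch assumes the function of Example (b), stored as a dictionary with the elements written as strings; the helper names `image` and `preimage` are illustrative.

```python
def image(f: dict, subset: set) -> set:
    """Direct image f[A'] = {f(a) : a in A'}."""
    return {f[a] for a in subset}


def preimage(f: dict, subset: set) -> set:
    """Inverse image f^{-1}[B'] = {a in A : f(a) in B'}."""
    return {a for a in f if f[a] in subset}


f = {"a1": "b1", "a2": "b3", "a3": "b1"}
A = set(f)
B = {"b1", "b2", "b3"}

assert image(f, {"a1", "a3"}) == {"b1"}
assert image(f, A) == {"b1", "b3"}
assert preimage(f, {"b2"}) == set()
assert preimage(f, {"b1", "b3"}) == preimage(f, B) == A
# Taking the image and then the inverse image can enlarge a set when f is not injective.
assert preimage(f, image(f, {"a1"})) == {"a1", "a3"}
```

Note that `preimage` is defined for every function, invertible or not: it never needs $f^{-1}$ as a function on elements.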

Exercises A.2.34 1. Let $f : A \to B$ be a function, and let $A' \subseteq A$, $B' \subseteq B$.

(a) Show that $A' \subseteq f^{-1}[f[A']]$

(b) Show that $B' \supseteq f[f^{-1}[B']]$

(c) Show that $A' = f^{-1}[f[A']]$ for every $A'$ if and only if $f$ is injective.

(d) Show that $B' = f[f^{-1}[B']]$ for every $B'$ if and only if $f$ is surjective.

[Hints: Reason along the following lines:

(b) If $b \in f[f^{-1}[B']]$ then $b = f(a)$ for some $a \in f^{-1}[B']$. But then $f(a) \in B'$, and so
$b \in B'$.

(c) If $a \in f^{-1}[f[A']]$ then $f(a) \in f[A']$. Thus there is $a' \in A'$ such that $f(a) = f(a')$. But
since $f$ is injective, $a = a'$, and so $a \in A'$.]

2. Inverse images preserve the set operations: Let $f : A \to B$, and suppose that $G$, $H$ are
subsets of $B$. Then

(a) If $G \subseteq H$, then $f^{-1}[G] \subseteq f^{-1}[H]$;

(b) $f^{-1}[G \cap H] = f^{-1}[G] \cap f^{-1}[H]$;

(c) $f^{-1}[G \cup H] = f^{-1}[G] \cup f^{-1}[H]$;

(d) $f^{-1}[G - H] = f^{-1}[G] - f^{-1}[H]$.






3. Direct images are not quite so well behaved: Let $f : A \to B$, and suppose that $G, H \subseteq A$.

(a) Suppose that $G \subseteq H$. Show that $f[G] \subseteq f[H]$;

(b) Show that $f[G \cup H] = f[G] \cup f[H]$;

(c) Show that $f[G \cap H] \subseteq f[G] \cap f[H]$;

(d) Give an example to show that we may not have $f[G \cap H] = f[G] \cap f[H]$;

(e) Show that $f[G] - f[H] \subseteq f[G - H] \subseteq f[G]$;

(f) Give an example to show, in (e), that both $\subseteq$'s may fail to be $=$'s.

□ 

We end this section with some notation: Suppose that $A$, $B$ are finite sets, and that $A$
has $n$ elements, and $B$ has $m$ elements. How many functions are there from $A$ to $B$?
For each $a \in A$ we have $m$ choices for the value $f(a) \in B$. Thus there are $m^n$ functions from
$A$ to $B$. For that reason we adopt the following notation:

Definition A.2.35 Let $A$, $B$ be sets. Then we define

$B^A$ = set of all functions from $A$ to $B$

Some authors use ${}^A B$ instead of $B^A$.

□ 
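The count $m^n$ can be confirmed by brute force for small sets. The sketch below assumes that functions are represented as dictionaries and uses `itertools.product` to make one independent choice of value for each element of $A$; the name `all_functions` is chosen for illustration.

```python
from itertools import product


def all_functions(A, B):
    """Yield every function A -> B as a dict; there are len(B) ** len(A) of them."""
    A = list(A)
    for values in product(B, repeat=len(A)):
        yield dict(zip(A, values))


A = ["a", "b", "c"]   # n = 3 elements
B = [0, 1]            # m = 2 elements
functions = list(all_functions(A, B))
print(len(functions), len(B) ** len(A))
```

Each tuple produced by `product` records one choice of $f(a)$ for every $a \in A$, which is exactly the counting argument given above.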

Note that each function $f : A \to B$ is a subset of $A \times B$. Hence $B^A$ is a set of subsets of
$A \times B$, i.e. $B^A \in \mathcal{P}(\mathcal{P}(A \times B))$.

A.3 Countable and Uncountable Sets

In this section, we investigate the idea of the size or cardinality of a set. For finite sets, we
can determine the size of a set by counting its elements. Thus for example, the set $\{a, b, c\}$
has cardinality 3 (it has 3 elements). We are going to extend this idea of counting to obtain
the size of infinite sets, and we will show that infinity comes in many sizes.

Let's explore the idea of counting: For the moment, let $\mathbf{n} = \{1, 2, \dots, n\}$ be the set of the
first $n$ natural numbers. To say that $A = \{a, b, c\}$ has 3 elements is equivalent to saying that
there is a one-to-one correspondence between the sets $A$ and $\mathbf{3}$. Indeed, this is the heart of
the idea of counting: When we count the elements of $A$, we are setting up a bijection between
$A$ and $\mathbf{3}$. We go "a first, b second, c third". This is equivalent to a map $f : A \to \mathbf{3}$ defined by
$f(a) = 1$, $f(b) = 2$, $f(c) = 3$. Thus the idea of counting the elements of a finite set $X$ involves
finding a bijection between $X$ and some $\mathbf{n}$. If there is a bijection from $X$ to $\mathbf{n}$, then $X$ has $n$
elements.

It is obvious that two finite sets $A$ and $\mathcal{A}$ have the same size if and only if there is a
one-to-one correspondence $f : A \to \mathcal{A}$. We don't even have to count $A$ and $\mathcal{A}$ to know that
they have the same number of elements. If $A = \{a, b, c, d\}$ and $\mathcal{A} = \{\alpha, \beta, \gamma, \delta\}$, then the
existence of the bijection $f : A \to \mathcal{A}$ given by

$f(a) = \beta$, $f(b) = \delta$, $f(c) = \alpha$, $f(d) = \gamma$

is sufficient to show that $A$ and $\mathcal{A}$ have the same number of elements. It doesn't tell us that
this number is 4.

Thus two sets have the same size if and only if there is a bijection between them; we can 
bypass the idea of number. This is important, because we cannot actually count infinite sets. 
But we can establish bijective correspondences between infinite sets. We shall adopt this idea 
as our basic idea of size. 

Definition A.3.1 We define an equivalence relation $\approx$ between sets as follows: If $A$, $B$
are sets, we say that $A \approx B$ if and only if there is a bijection from $A$ to $B$. If $A \approx B$,
we say that $A$ and $B$ have the same cardinality. We may also indicate this by saying
$|A| = |B|$.

Note that having the same cardinality is an equivalence relation between sets, i.e. that

(i) $|A| = |A|$ (Reflexivity)

(ii) If $|A| = |B|$, then $|B| = |A|$ (Symmetry)

(iii) If $|A| = |B|$ and $|B| = |C|$, then $|A| = |C|$ (Transitivity)

Exercise A.3.2 Prove this assertion. (Note that the assertion is not obvious: When we say
that $|A| = |B|$, we are not actually claiming that there are two equal numbers. What we are
saying is that there is a bijection from $A$ to $B$. To prove (i), for example, you have to find a
bijection from $A$ to $A$.)

□ 

Examples A.3.3 (a) Two finite sets have the same cardinality if and only if they have the
same number of elements.

(b) For finite sets, if $A$ is a proper subset of $B$, then $|A| < |B|$. This breaks down completely
for infinite sets. Consider, for example, the sets $\mathbb{N}$ and $\mathbb{Z}$. It is certainly true that $\mathbb{N} \subset \mathbb{Z}$.
However, the map $f : \mathbb{N} \to \mathbb{Z}$ defined by

$f(n) = \begin{cases} \frac{n}{2} & \text{if } n \text{ is even} \\ -\frac{n-1}{2} & \text{if } n \text{ is odd} \end{cases}$

is a bijection: $f(1) = 0$, $f(2) = 1$, $f(3) = -1$, $f(4) = 2$, $f(5) = -2$, $f(6) = 3, \dots$ (Note
that we are zig-zagging from the positive integers to the negative integers.) Thus $\mathbb{N}$ and
$\mathbb{Z}$ have the same cardinality, even though $\mathbb{N}$ seems to contain fewer elements than $\mathbb{Z}$.

(c) We also have $|\mathbb{Q}| = |\mathbb{N}|$. This can be seen as follows. Put the set of strictly positive
rational numbers $\mathbb{Q}^+$ in an array

1/1  2/1  3/1  4/1  5/1  ...
1/2  2/2  3/2  4/2  5/2  ...
1/3  2/3  3/3  4/3  5/3  ...
1/4  2/4  3/4  4/4  5/4  ...
1/5  2/5  3/5  4/5  5/5  ...
...







We can then trace a zig-zag path that moves through all the positive rational numbers as follows.
Start in the top row and move diagonally down to the left until you reach the leftmost
column. Then repeat, starting from the next entry of the top row. We thus obtain a sequence

$\frac{1}{1}, \frac{2}{1}, \frac{1}{2}, \frac{3}{1}, \frac{2}{2}, \frac{1}{3}, \frac{4}{1}, \frac{3}{2}, \frac{2}{3}, \frac{1}{4}, \frac{5}{1}, \dots$

All of the strictly positive rational numbers occur in this sequence, and they all occur
infinitely many times. For example, $\frac{1}{1}, \frac{2}{2}, \frac{3}{3}, \dots$ lie along the diagonal, and they are all
equal. To obtain a bijection from $\mathbb{N}$ to $\mathbb{Q}^+$, we follow the above sequence of rationals, but
we omit any number that has already occurred to ensure that the function is one-to-one,
i.e. we prune away the repeated values. We therefore define the function $f : \mathbb{N} \to \mathbb{Q}^+$ by

$f(1) = \frac{1}{1}$, $f(2) = \frac{2}{1}$, $f(3) = \frac{1}{2}$, $f(4) = \frac{3}{1}$, $f(5) = \frac{1}{3}$, $f(6) = \frac{4}{1}, \dots$

Note that $f(5) \neq \frac{2}{2}$, which is after $f(4) = \frac{3}{1}$ in the sequence, because $\frac{2}{2} = \frac{1}{1}$ has already
occurred as $f(1)$. Then $f$ is a bijection from $\mathbb{N}$ to $\mathbb{Q}^+$. Now even though we haven't
found a formula for $f$, it is nevertheless a perfectly good function, and all its values can
be calculated. Can you see that $f(16) = \frac{2}{5}$?

In the same way, we can set up a bijection $g$ from $\mathbb{N}$ to the negative rationals. Just put
$g(n) = -f(n)$. Finally, we can define a bijection $h : \mathbb{N} \to \mathbb{Q}$ using $f$, $g$ and another
zig-zag: We define

$h(1) = 0$, $h(2) = f(1)$, $h(3) = g(1)$, $h(4) = f(2)$,
$h(5) = g(2)$, $h(6) = f(3)$, $h(7) = g(3), \dots$

Again, we have no formula for $h$, but it is certainly a well-defined function, and all its
values can be calculated. Check that $h(23) = -\frac{1}{5}$.

□ 
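Although the bijections in Examples (b) and (c) have no convenient closed formula in the rational case, they are easy to compute. The sketch below is one possible implementation, assuming Python's `fractions.Fraction` for exact arithmetic; the names `to_integer`, `positive_rationals` and `rationals` are illustrative. It walks the diagonals of the array, prunes repeated values, and interleaves positive and negative rationals as described above.

```python
from fractions import Fraction
from itertools import count, islice


def to_integer(n: int) -> int:
    """The zig-zag bijection N -> Z: 1, 2, 3, 4, 5, ... -> 0, 1, -1, 2, -2, ..."""
    return n // 2 if n % 2 == 0 else -(n - 1) // 2


def positive_rationals():
    """Yield f(1), f(2), ...: walk the diagonals of the array, skipping repeats."""
    seen = set()
    for s in count(2):                      # s = numerator + denominator on a diagonal
        for denominator in range(1, s):     # from the top row down to the left column
            q = Fraction(s - denominator, denominator)
            if q not in seen:               # prune values that have already occurred
                seen.add(q)
                yield q


def rationals():
    """Yield h(1), h(2), ...: 0, f(1), g(1), f(2), g(2), ... with g(n) = -f(n)."""
    yield Fraction(0)
    for q in positive_rationals():
        yield q
        yield -q


f = list(islice(positive_rationals(), 16))
h = list(islice(rationals(), 23))
print([to_integer(n) for n in range(1, 7)])
print(f[:6])    # f(1), ..., f(6)
print(f[15])    # f(16)
print(h[22])    # h(23)
```

Running the sketch is a way to check the values of $f(16)$ and $h(23)$ asked for in the example. The set `seen` is what implements the pruning: a fraction such as 2/2 is skipped because its value has already been produced.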



Definition A.3.4 A set $A$ is said to be countable if there is a surjection from $\mathbb{N}$ onto $A$.

Remarks A.3.5 (a) Basically a set $A$ is countable if its elements can be indexed by the
natural numbers, i.e. if it can be written as $A = \{a_n : n \in \mathbb{N}\}$. For if $A$ is countable and
not finite, then there is a bijection $f : \mathbb{N} \to A$, and we can take $a_n = f(n)$. Conversely, if
$A = \{a_n : n \in \mathbb{N}\}$ is infinite, we can define a bijection from $\mathbb{N}$ to $A$ by letting $f(n) = a_n$
(although here some pruning is necessary if the $a_n$ aren't all distinct; see Example
A.3.3(c)).

(b) A set $A$ is countable if and only if it is either finite or can be put into a one-to-one
correspondence with the natural numbers, i.e. if $|A| = n$ for some $n \in \mathbb{N}$, or $|A| = |\mathbb{N}|$.

(c) In Example A.3.3 we proved that the sets $\mathbb{Z}$ and $\mathbb{Q}$ are countable sets.



(d) The "zig-zag" technique, used above to prove that the rational numbers are countable, 
is often very useful. 

□ 






Exercise A.3.6 Prove that the union of countably many countable sets is countable (i.e.
prove that if $A_n$ ($n \in \mathbb{N}$) are countable sets, then the set $\bigcup_{n \in \mathbb{N}} A_n$ is countable as well).
[Hint: Zig-zag!]

□ 

So all the infinite sets we've seen so far are countable (and the finite ones also, of course). A 
very natural question that might occur to you is the following: Are all infinite sets countable? 
The answer is "No!" 

Example A.3.7 We show that the unit interval $I = [0, 1]$ is uncountable, i.e. that we cannot
find an enumeration

$I = \{x_n : n \in \mathbb{N}\}$

The proof is by contradiction: Suppose that we can find such an enumeration $I = \{x_1, x_2, x_3, x_4, \dots\}$,
i.e. that every real number in $[0, 1]$ is equal to $x_n$ for some $n$. Now every number $x_n$ has a
decimal expansion of the form

$x_n = 0.x_{n1}x_{n2}x_{n3}x_{n4}x_{n5}\dots$

where $x_{nm}$ is the $m$-th digit in the decimal expansion of $x_n$. Of course some real numbers
have two distinct decimal expansions, a terminating one and a non-terminating one. For
example, $1.0000\dots = 0.9999\dots$¹ We will choose the non-terminating decimal expansions for
our $x_n$.

We now create a new real number $x$ from the $x_n$ by a process called diagonalization. We
choose $a_n \in \{1, 2, \dots, 9\}$ such that the following hold:

$a_1 \neq x_{11}$, $a_2 \neq x_{22}$, $a_3 \neq x_{33}$, $\dots$, $a_n \neq x_{nn}$, $\dots$

To avoid a situation where we obtain a number $x$ with a terminating decimal expansion, we
haven't permitted $a_n = 0$; this is just a technicality. We can now define $x$: Put

$x = 0.a_1a_2a_3a_4\dots$

Here comes the heart of the argument: Clearly $x \in I = [0, 1]$. Now if $I$ can be written as a
list $\{x_1, x_2, x_3, \dots\}$, then there must be some $n$ such that $x = x_n$. But the first decimal place
of $x$ differs from the first decimal place of $x_1$, since $a_1 \neq x_{11}$; hence $x \neq x_1$. (Both expansions
are non-terminating, and a number has only one non-terminating expansion.) Similarly, the
second decimal place of $x$ differs from the second decimal place of $x_2$, since $a_2 \neq x_{22}$; hence
$x \neq x_2$. We can continue in this way to show that $x \neq x_n$ for any $n \in \mathbb{N}$, i.e. $x$ is not on the
list $\{x_1, x_2, x_3, \dots\}$.

This proves the result! Given any list $x_1, x_2, x_3, \dots$ of real numbers in $[0, 1]$, we now have
a technique for producing a new real number $x$ that is not on the list. It thus follows that no
such list can contain all the real numbers in $[0, 1]$, i.e. there is no surjection, and in particular no bijection, from $\mathbb{N}$ onto $[0, 1]$.

□ 
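The diagonal construction can be tried out on a finite portion of a list. The sketch below is only an illustration, assuming that each listed number is given by a string holding the first few digits after the decimal point; the sample digits and the name `diagonal_digits` are invented for the purpose. It picks each $a_n$ from $\{1, \dots, 9\}$ so that it differs from the $n$-th digit of the $n$-th number.

```python
def diagonal_digits(expansions):
    """Return digits a_1 a_2 ... with a_n in 1..9 and a_n != n-th digit of x_n."""
    digits = []
    for n, x in enumerate(expansions):
        d = int(x[n])                       # the diagonal digit x_nn (0-based index n)
        digits.append(1 if d != 1 else 2)   # any digit in 1..9 other than d would do
    return "".join(map(str, digits))


# First seven digits of seven hypothetical list entries x_1, ..., x_7.
listed = ["3333333", "1415926", "9999999", "7182818", "4999999", "1111111", "0101010"]
x = diagonal_digits(listed)
print("0." + x)

# x differs from the n-th entry in the n-th decimal place, and never uses the digit 0.
assert all(x[n] != listed[n][n] for n in range(len(listed)))
assert "0" not in x
```

A finite list can of course never exhaust $[0, 1]$; the point of the sketch is only that the rule producing $a_n$ looks at a single digit of $x_n$, so it extends to any infinite list.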

Hence there are uncountable sets. Clearly $\mathbb{R}$ is also uncountable, because otherwise we
could find an enumeration $\{r_1, r_2, r_3, \dots\}$ of $\mathbb{R}$. By omitting any reals which are not in $[0, 1]$,
we could prune this into an enumeration of $[0, 1]$.

¹An easy way to see this is to note that $1 = 3 \times \frac{1}{3} = 3(0.333\dots) = 0.999\dots$.






Exercise A.3.8 Show that if $A$ is any set, then $|A| \neq |\mathcal{P}(A)|$. Conclude that $\mathcal{P}(\mathbb{N})$ is
uncountable. (Actually, it can be proved that $|\mathbb{R}| = |\mathcal{P}(\mathbb{N})|$.)

[Hint: Suppose that $f : A \to \mathcal{P}(A)$ and consider the set $B := \{a \in A : a \notin f(a)\}$. By
contradiction, show that $B \notin \operatorname{ran} f$.]

□ 

Consider that 

1. $\mathbb{Q}$ satisfies all the axioms that $\mathbb{R}$ does, except for the Completeness Axiom;

2. $\mathbb{Q}$ is countable, but $\mathbb{R}$ is uncountable.

This juxtaposition leads one to suspect that it is the Completeness Axiom which is responsible
for the uncountability of $\mathbb{R}$. This is indeed the case, as you will see by proving the following
proposition:

Proposition A.3.9 Let $I$ be a bounded interval of real numbers containing more than one point, and let
$\{a_n : n \in \mathbb{N}\} \subseteq I$. Then there is a $p \in I$ such that $p \neq a_n$ for all $n$.
In particular, $I \neq \{a_n : n \in \mathbb{N}\}$ for any sequence $a_n$. Hence $I$ is uncountable.



Exercise A.3.10 We prove Proposition A.3.9.

(a) Explain why there is a closed interval $I_1 \subseteq I$ such that $a_1 \notin I_1$. [Hint: Divide $I$ into three
closed subintervals of equal length.]

(b) Explain why there is a closed subinterval $I_2 \subseteq I_1$ such that $a_2 \notin I_2$. Explain why also
$a_1 \notin I_2$.

(c) Now assume that we have found a closed subinterval $I_n$ such that $a_1, \dots, a_n \notin I_n$. Explain
why there is a closed subinterval $I_{n+1} \subseteq I_n$ such that $a_{n+1} \notin I_{n+1}$. Explain also why we
now have $a_1, a_2, \dots, a_{n+1} \notin I_{n+1}$.

(d) We now have constructed a sequence of closed intervals

$I_1 \supseteq I_2 \supseteq I_3 \supseteq \dots \supseteq I_n \supseteq \dots$

Explain why $\bigcap_{n \in \mathbb{N}} I_n \neq \emptyset$.

[Hint: This uses the Completeness Axiom: Let $l_n$ be the left endpoint of $I_n$. Show that
$\sup\{l_n : n \in \mathbb{N}\}$ exists, and that $\sup\{l_n : n \in \mathbb{N}\} \in \bigcap_{n \in \mathbb{N}} I_n$.]

(e) Let $p \in \bigcap_{n \in \mathbb{N}} I_n$. Explain why $p \in I$, and why $p \neq a_n$ for any $n \in \mathbb{N}$.

□ 
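The first steps of the construction in the exercise can be simulated for a finite stretch of a sequence. The sketch below is an illustration under stated assumptions: it starts from the closed interval $[0, 1]$, uses exact rational endpoints, and takes $a_n = 1/n$ as a sample sequence; the name `avoid` is hypothetical. At each step it divides the current interval into three closed thirds and keeps one that misses the next term.

```python
from fractions import Fraction


def avoid(sequence, left=Fraction(0), right=Fraction(1)):
    """Shrink [left, right] step by step so that the n-th interval misses a_1, ..., a_n."""
    for a in sequence:
        third = (right - left) / 3
        # the three closed thirds of the current interval
        pieces = [(left + k * third, left + (k + 1) * third) for k in range(3)]
        # a lies in at most two of them (two only when it is a shared endpoint),
        # so at least one piece misses a
        left, right = next((l, r) for l, r in pieces if not (l <= a <= r))
    return left, right


a = [Fraction(1, n) for n in range(1, 11)]   # a_n = 1/n for n = 1, ..., 10
lo, hi = avoid(a)
assert all(not (lo <= x <= hi) for x in a)
```

Since the intervals are nested, a term excluded at one step stays excluded at every later step. What the finite simulation cannot show is that the intersection of all the intervals is non-empty; that is where the Completeness Axiom is needed.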



